From: Ken Williams
Date: Mon Jan 8 21:38:10 2007
Subject: Re: text categorization with SVM and NaiveBayes

On Jan 8, 2007, at 10:51 AM, Tom Fawcett wrote:

> Just to add a note here: Ken is correct -- both NB and SVMs are
> known to be rather poor at providing accurate probabilities. Their
> scores tend to be too extreme. Producing good probabilities from
> these scores is called calibrating the classifier, and it's more
> complex than just taking a root of the score. There are several
> methods for calibrating scores. The good news is that there's an
> effective one called isotonic regression (or Pool Adjacent
> Violators) which is pretty easy and fast. The bad news is that
> there's no plug-in (ie, CPAN-ready) perl implementation of it (I've
> got a simple implementation which I should convert and contribute
> someday).
>
> If you want to read about classifier calibration, google one of
> these titles:
>
> "Transforming classifier scores into accurate multiclass
> probability estimates"
> by Bianca Zadrozny and Charles Elkan
>
> "Predicting Good Probabilities With Supervised Learning"
> by A. Niculescu-Mizil and R. Caruana

Cool, thanks for the references. It might be nice to add somesuch
scheme to Algorithm::NaiveBayes (and friends), so that the user has a
choice of several normalization schemes, including "none". If I get
a surplus of tuits I'll add it, or if you feel like contributing your
stuff that would be great too.

-Ken
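
For readers who want to see what the Pool Adjacent Violators idea mentioned above looks like in practice, here is a minimal sketch in Python. It is not Tom's implementation and not part of Algorithm::NaiveBayes; the function names `pav_calibrate` and `calibrated_prob` are hypothetical, and the sketch assumes binary 0/1 labels and a held-out set of raw classifier scores. It sorts examples by score, then merges neighbouring blocks whenever the fraction of positives would otherwise decrease as the score increases, so the result is a non-decreasing step function from score to probability.

```python
def pav_calibrate(scores, labels):
    """Fit a non-decreasing step function from raw scores to probabilities.

    scores: raw classifier scores (any real numbers)
    labels: 0/1 true labels, same length as scores
    Returns a list of (low_score, high_score, probability) blocks,
    ordered by score.
    """
    # Each block is [sum_of_labels, count, low_score, high_score].
    blocks = []
    for score, label in sorted(zip(scores, labels)):
        blocks.append([float(label), 1, score, score])
        # Merge the last two blocks while they share a score (ties) or
        # the earlier block's mean is not below the later one's (a violator).
        while len(blocks) > 1 and (
            blocks[-2][3] == blocks[-1][2]
            or blocks[-2][0] / blocks[-2][1] >= blocks[-1][0] / blocks[-1][1]
        ):
            last = blocks.pop()
            blocks[-1][0] += last[0]
            blocks[-1][1] += last[1]
            blocks[-1][3] = last[3]
    return [(low, high, total / count) for total, count, low, high in blocks]


def calibrated_prob(blocks, score):
    """Look up the calibrated probability for a new raw score."""
    prob = blocks[0][2]  # scores below every block get the lowest probability
    for low, _high, p in blocks:
        if score >= low:
            prob = p  # keep the last block whose lower bound we have passed
    return prob


if __name__ == "__main__":
    # Toy held-out data: raw scores and whether each example was positive.
    held_out_scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]
    held_out_labels = [0, 1, 0, 1, 1, 1]
    fitted = pav_calibrate(held_out_scores, held_out_labels)
    for low, high, p in fitted:
        print(f"scores {low}..{high} -> probability {p}")
    print(calibrated_prob(fitted, 0.25))
```

The lookup here is a plain step function; interpolating between blocks is a common variation. For the multiclass case, the Zadrozny and Elkan title cited above is the place to look.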
Group: perl.ai at server nntp.perl.org