From: Ed Batutis
Date: Wed Aug 13 08:50:11 2008
Subject: RE: Language detection
> Thanks Ed. Unless I've misunderstood, this is just doing charset
> detection, with language as a bonus when the charset implies it? 

That wouldn't be very useful. No, it uses recognizers for charset/language
combinations.

> difference between say English, French and German, all in UTF-8
> encoding, please let me know.

It does not ship with the data needed to detect languages within UTF-8, but
the structure to support it is in place.

You might want to consider adding data to their framework for what you want
to do. It isn't complicated. The most important thing you need is good
sample text in quantity so you can generate the n-gram probability table.
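
To make the idea concrete, here is a minimal sketch of building an n-gram
probability table from sample text and using it to pick a language. It is in
Python for brevity, and it is not the framework's actual code or data format:
the function names (ngrams, train, score, detect), the use of character
trigrams, and the fixed penalty for unseen n-grams are all assumptions made for
illustration. The sample strings are far too small for real use.

```python
import math
from collections import Counter


def ngrams(text, n=3):
    """Return the character n-grams of text (hypothetical helper)."""
    # Normalize whitespace and pad with spaces so word boundaries
    # show up as part of the n-grams.
    padded = " " + " ".join(text.lower().split()) + " "
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


def train(sample_text, n=3):
    """Build a table mapping each n-gram to its log-probability."""
    counts = Counter(ngrams(sample_text, n))
    total = sum(counts.values())
    return {gram: math.log(count / total) for gram, count in counts.items()}


def score(text, table, n=3, floor=math.log(1e-6)):
    """Sum the log-probabilities of the n-grams in text.

    N-grams missing from the table get a fixed penalty (an assumed,
    very crude form of smoothing).
    """
    return sum(table.get(gram, floor) for gram in ngrams(text, n))


def detect(text, tables):
    """Return the language whose table gives text the highest score."""
    return max(tables, key=lambda lang: score(text, tables[lang]))


# Toy samples only; a usable table needs good sample text in quantity.
SAMPLES = {
    "en": "the quick brown fox jumps over the lazy dog "
          "and then the dog sleeps in the sun",
    "fr": "le renard brun saute par dessus le chien paresseux "
          "et puis le chien dort au soleil",
    "de": "der schnelle braune fuchs springt über den faulen hund "
          "und dann schläft der hund in der sonne",
}
TABLES = {lang: train(text) for lang, text in SAMPLES.items()}

if __name__ == "__main__":
    print(detect("the dog and the fox", TABLES))
```

Because the input is decoded text, the same tables work regardless of the
original encoding, which is why the quality and quantity of the sample text
matter more than the code itself.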

I believe the code was taken from Mozilla, so you might look there. Maybe
they've already done what you are looking for.

=Ed


Group: php.i18n at news.php.net




  