Natural Language Processing Products

Envion’s experience in developing proprietary NLP software goes back three decades, to 1985, when we began work on the Dashes Hyphenator, the first product in the company’s NLP line. Dashes became a great success and is now used by a large number of businesses and universities around the globe.

Presently, Envion’s NLP product line comprises two main products: WordFan NLP ToolBox and Dashes Hyphenator. Both products are notable for their precision and processing speed. We achieve these by applying several proprietary inventions and by having a highly qualified in-house linguistic team process the extensive word lists that our software relies on.

The processing quality of our NLP software has earned us a customer list that includes a number of leading technology vendors and publishers: Microsoft, Adobe, Quark, Oracle, Advanced Publishing Technologies, P.INK, Atex, Bitstream, Celex, CompuSense, Concept Publishing, Dalim, FutureOfNews, ICON Technologies, InContext, IPA Systems, Linotype, Media Service Group, Monotype, NewsNet, PAGE Systems, Pre1, NEWSCYCLE Solutions, RagTime, Publishing Business Systems, Spyglass, Synaptic, Turn-Key Systems, US West Dex, Winsoft, Wright Technology, DRI, Diwan, Delve, and many others.

 

WordFan NLP ToolBox

WordFan Natural Language ToolBox is a multilingual stemming library that is easy to use yet built on extensive linguistic knowledge.

The software allows for the in-depth exploration of the entire range of paradigmatic relations of a lexical unit.

The library supports several types of lookup for tracing the linguistic relations of a word. Depending on the type of lookup selected, it returns one of the following:

 

  • All the forms of the input word with their grammatical meaning (conjugation).
  • Base forms of the input word with their grammatical meaning (normalization).
  • All forms similar to the input (approximate lookup).
  • All the exact matches of the input word (exact lookup).
  • Constituent elements of the input compound word (decompounding).

Each returned form comes with a detailed grammatical description, which ranges from the word’s part of speech to its grammatical case.
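To make the lookup types more concrete, here is a minimal Python sketch built on a tiny hand-written lexicon. It is not WordFan's API or data: the `LEXICON` table, the tag strings, and the function names `exact_lookup`, `normalize`, `all_forms`, and `approximate_lookup` are all hypothetical, and the fuzzy matching simply uses Python's standard `difflib` module rather than any Envion algorithm.

```python
from difflib import get_close_matches

# Hypothetical toy lexicon: surface form -> list of (base form, grammatical tags).
# The entries and tag wording are illustrative only, not WordFan's actual data.
LEXICON = {
    "walk": [("walk", "verb, infinitive")],
    "walks": [("walk", "verb, 3rd person singular present")],
    "walked": [("walk", "verb, past tense")],
    "walking": [("walk", "verb, present participle")],
    "talk": [("talk", "verb, infinitive")],
    "talks": [("talk", "verb, 3rd person singular present")],
}


def exact_lookup(word):
    """Exact lookup: return the analyses stored for this exact surface form."""
    return LEXICON.get(word, [])


def normalize(word):
    """Normalization: return the distinct base forms of the input word."""
    return sorted({base for base, _ in exact_lookup(word)})


def all_forms(word):
    """Return every form that shares a base form with the input, with its tags."""
    bases = set(normalize(word))
    return sorted(
        (form, tags)
        for form, analyses in LEXICON.items()
        for base, tags in analyses
        if base in bases
    )


def approximate_lookup(word, limit=3):
    """Approximate lookup: return lexicon forms that are similar to the input."""
    return get_close_matches(word, list(LEXICON), n=limit, cutoff=0.75)


if __name__ == "__main__":
    print(normalize("walked"))          # base forms of an inflected form
    print(all_forms("walks"))           # the whole paradigm, with tags
    print(approximate_lookup("wallk"))  # forms close to a misspelled input
```

A production library would replace the dictionary with a compact, language-specific data structure, which is also where the speed gap between exact and approximate lookup tends to come from: an exact lookup is a single key access, while an approximate lookup has to compare the input against many candidates.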

WordFan’s processing speed varies depending on the language and request type: from around 2,000 words per second for Approximate lookup, to more than 100,000 words per second for Exact lookup.

Currently, the library supports 7 languages: Arabic, Danish, English (AmE, BrE, AusE), German (both modern and pre-reform orthography), French, Russian, and Polish. The library’s lexical coverage of these languages is very broad, and Envion’s linguistic experts have meticulously optimized it to avoid ambiguity and misleading overlaps.

Additionally, WordFan includes spell-checking functionality, and it can also be used in conjunction with Envion’s Dashes Hyphenator.

 

Dashes Hyphenator

Dashes is an innovative hyphenation library based on one of Envion’s proprietary algorithms. The algorithm formalizes the complex patterns that govern lexical stress in different languages, which allows Dashes to hyphenate correctly in 31 languages.

The library ranks the suggested hyphens in accordance with their stylistic value. Non-standard lexical units are precisely hyphenated using a custom set of rules specifically designed by Envion’s linguistic experts.

Currently, Dashes processes more than 100,000 words per second with 99.9% hyphenation accuracy. The library also hyphenates Germanic compound words correctly by breaking them down into their constituent elements.
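The idea of hyphenating a compound through its constituents, and of ranking the resulting hyphens, can be illustrated with a short Python sketch. This is not Envion's proprietary algorithm: the `CONSTITUENTS` word list, the greedy longest-prefix split, the two-level ranking (1 for a boundary between constituents, 2 for a break inside a constituent), and all function names are assumptions made purely for illustration.

```python
# Hypothetical constituent lexicon; a real system would use a large word list.
CONSTITUENTS = {"haus", "tür", "schlüssel"}


def decompound(word, lexicon=CONSTITUENTS):
    """Split a compound into known constituents, or return None if impossible."""
    word = word.lower()
    if not word:
        return []
    # Try the longest prefix first, then recurse on the remainder.
    for end in range(len(word), 0, -1):
        head = word[:end]
        if head in lexicon:
            rest = decompound(word[end:], lexicon)
            if rest is not None:
                return [head] + rest
    return None


def ranked_hyphens(word, inner_breaks=None, lexicon=CONSTITUENTS):
    """Return (offset, rank) pairs; rank 1 = compound boundary, 2 = inner break.

    inner_breaks is a hypothetical mapping from a constituent to the offsets
    at which that constituent may itself be broken.
    """
    parts = decompound(word, lexicon)
    if parts is None:
        return []
    inner_breaks = inner_breaks or {}
    points = []
    offset = 0
    for index, part in enumerate(parts):
        for local in inner_breaks.get(part, []):
            points.append((offset + local, 2))
        offset += len(part)
        if index < len(parts) - 1:
            points.append((offset, 1))
    # Best-ranked hyphens first, then left to right.
    return sorted(points, key=lambda point: (point[1], point[0]))


def hyphenate(word, points):
    """Insert a hyphen at every offset in points, keeping the original casing."""
    pieces, start = [], 0
    for cut in sorted(offset for offset, _ in points):
        pieces.append(word[start:cut])
        start = cut
    pieces.append(word[start:])
    return "-".join(pieces)


if __name__ == "__main__":
    breaks = ranked_hyphens("Haustürschlüssel", {"schlüssel": [6]})
    print(breaks)
    print(hyphenate("Haustürschlüssel", breaks))
```

In a layout engine, a ranking of this kind lets the line breaker prefer a break at a constituent boundary and fall back to a lower-ranked break only when the line cannot be filled otherwise.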

 

To learn more about our NLP product line, or to purchase a product, please send an email to envioninfo@envionsoftware.com.

You can also read a detailed case study on the development of Envion’s NLP product line.

 

© Envion Software, 2015

 

 
