Linguistic prescriptivism sucks

Following up on the previous post, here’s something fascinating. The developers of an AI project that is meant to provide a vast base of conceptual associations to help computers process text in English are trying to purge it of racism.

As I understand it, ConceptNet is meant to help your application parse incoming natural-language text. By definition, that text will be a dip out of the pool of living English, for good or ill. And if you are training a machine learning algorithm to understand it better, the weights in the association graph are going to trend towards whatever the incoming corpus implies. Inasmuch as the project is meant to comprehend English, it must be a descriptive one, and that means dealing with the language warts and all.
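To make the point about weights concrete, here is a toy sketch in Python. It is not how ConceptNet is actually built; the function name `association_weights` and the four-sentence corpus are made up for illustration, and the "weights" are nothing more than normalised co-occurrence counts. Even so, it shows the basic mechanism: the associations are whatever the corpus says they are.

```python
from collections import Counter
from itertools import combinations


def association_weights(sentences):
    """Toy association graph: weight each unordered word pair by how often
    the two words share a sentence, normalised so all weights sum to 1."""
    counts = Counter()
    for sentence in sentences:
        # Deduplicate and sort so each pair has one canonical (a, b) key.
        words = sorted(set(sentence.lower().split()))
        for a, b in combinations(words, 2):
            counts[(a, b)] += 1
    total = sum(counts.values())
    return {pair: n / total for pair, n in counts.items()}


# Hypothetical corpus: "nurse" and "patient" co-occur twice,
# "doctor" and "patient" only once.
corpus = [
    "nurse helps patient",
    "nurse helps doctor",
    "doctor helps patient",
    "nurse helps patient",
]

weights = association_weights(corpus)
for pair, weight in sorted(weights.items(), key=lambda item: -item[1]):
    print(pair, round(weight, 3))
```

Nothing in the code prefers nurses to doctors; the stronger nurse/patient link comes entirely from the sentences fed in. Swap the corpus and the graph changes with it.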

The important question is what the application does then. If your app is making judgments on the basis of word associations, it’s likely to end up being seriously prejudiced in some way or other. The purpose of the system is what it does, as Stafford Beer said; the problem with the system is also what it does. The D-word rules.
