topical media & game development

#mobile-game-ch22-hangman-agid.txt / txt



  Automatically Generated Inflection Database (AGID)
  
  August 19, 2000
  Revision 2
  
  Copyright 2000 by Kevin Atkinson <kevina@users.sourceforge.net>
  
  The file "infl.txt" is an automatically created database of the
  inflected forms of words from an insanely large word list.
  
  The latest version can be found at http://aspell.sourceforge.net/wl/.
  
  Entries are in the following form.
  
  <word> <pos>: <inflected forms>
  
  Where <pos> is V for verb, N for noun, or A for adjective or adverb.
  If <pos> is followed by a ?, the part of speech was not in the
  part-of-speech database, but the inflected forms of the word were
  found in the word list.
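  
  As an illustration only, one entry line could be split apart along the
  lines below.  This is a minimal Python sketch, not part of the AGID
  distribution; the names ENTRY_RE and parse_entry are made up here, and
  the pattern assumes exactly one space before <pos> and one after the
  colon, as in the "hang" example given in this file.

```python
import re

# Assumed layout of one entry: "<word> <pos>[?]: <inflected forms>".
# ENTRY_RE and parse_entry are hypothetical names, not part of AGID.
ENTRY_RE = re.compile(
    r"^(?P<word>\S+) (?P<pos>[VNA])(?P<unsure>\??): (?P<forms>.*)$"
)

def parse_entry(line):
    """Split one entry into word, part of speech, and the raw forms."""
    match = ENTRY_RE.match(line.rstrip("\n"))
    if match is None:
        raise ValueError(f"not an AGID entry: {line!r}")
    return {
        "word": match["word"],
        "pos": match["pos"],
        # True when <pos> was followed by a "?".
        "pos_unverified": match["unsure"] == "?",
        "forms": match["forms"],
    }
```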
  
  The inflected forms are in the following order for verbs (except for
  the verb "be"):
    <past tense>  [<past participle>]  <-ing form>  <plural form>
  and for adjectives or adverbs:
    <-er form>  <-est form>
  There are two spaces between each form.
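  
  Because the forms are separated by two spaces, they can be told apart
  from alternates, which use single spaces.  The Python sketch below
  labels the forms of a verb or adjective/adverb entry in the order given
  above.  The name label_forms is hypothetical, and treating a three-form
  verb as one without a separate past participle is an assumption drawn
  from the square brackets in the order line.

```python
# Slot names follow the order stated in this README.
VERB_SLOTS = {
    3: ["past tense", "-ing form", "plural form"],
    4: ["past tense", "past participle", "-ing form", "plural form"],
}
ADJ_SLOTS = ["-er form", "-est form"]

def label_forms(pos, forms):
    """Map each two-space-separated form to its slot name."""
    slots = forms.strip().split("  ")
    if pos == "V" and len(slots) in VERB_SLOTS:
        names = VERB_SLOTS[len(slots)]
    elif pos == "A" and len(slots) == len(ADJ_SLOTS):
        names = ADJ_SLOTS
    else:
        # Nouns and the verb "be" are not covered by this sketch.
        raise ValueError(f"unsupported entry: {pos} with {len(slots)} forms")
    return dict(zip(names, slots))
```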
  
  A word in parentheses means that it is considered a less preferred form
  of the previous inflection.  Two parentheses mean that the word is
  even less preferred, etc.  A / between two words means that the two
  words are considered almost equal variants or that it is difficult to
  tell which one is the primary form.  They are ordered by preference;
  however, sometimes this distinction is so slight it is meaningless.  A
  "|" between words means that both inflections are used depending on
  the meaning of the word.  If the distinction between the two forms can
  be described in a word, then that word is found after the form in
  braces, for example:
  
    hang V: hung {suspend} | hanged {execute}  hanging  hangs
  
  Notice how there are two spaces between the past tense, -ing form and
  plural form but not between the alternate forms of the past tense.  In
  general, if the "|" symbol would be needed more than once, the entry
  is split up into multiple lines like so:
  
    <word> [{explanation}] <POS>: <inflected forms>
  
  However, the past participle and past tense forms are considered a
  single form.  Thus, a "|" may appear more than once when the word
  contains both a past participle and a past tense form.
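  
  To illustrate the "|" and brace notation, the Python sketch below
  splits a single form such as the past tense of "hang" into its
  meaning-dependent alternates.  It is a sketch only: split_senses is a
  hypothetical name, a single space on each side of "|" is assumed from
  the example above, and parentheses and "/" are deliberately left
  unparsed.

```python
import re

# One alternate: a form, optionally followed by " {explanation}".
ALTERNATE_RE = re.compile(r"(\S+)(?: \{([^}]*)\})?")

def split_senses(slot):
    """Return (form, explanation) pairs; explanation may be None."""
    pairs = []
    for alternate in slot.split(" | "):
        alternate = alternate.strip()
        match = ALTERNATE_RE.fullmatch(alternate)
        if match is None:
            # Notation not handled here (e.g. parentheses): keep the text.
            pairs.append((alternate, None))
        else:
            pairs.append((match.group(1), match.group(2)))
    return pairs
```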
  
  A /? between words means that both inflections were found in the word
  list but the script was not sure which one to use.  A ~ after a word
  means that there is a slight chance that it is the plural of a word.
  A ! after a word indicates that the word is likely an inflection of a
  similar word (generally one ending in e) and not the current word.  A
  ? after a word means that the word was not in the word list but if it
  were it would be considered an inflected form of the base word.
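  
  The trailing markers can be read off a form mechanically.  The Python
  sketch below is an assumption-laden illustration: read_marker and
  split_uncertain are hypothetical names, the marker descriptions
  paraphrase this README, and the handling of optional whitespace around
  "/?" is a guess, since no example of "/?" is given here.

```python
import re

# Meaning of each trailing marker, paraphrased from this README.
MARKERS = {
    "~": "may be the plural of a word",
    "!": "likely an inflection of a similar word",
    "?": "not in the word list",
}

def read_marker(token):
    """Return (word, note); note is None when no marker is present."""
    if len(token) > 1 and token[-1] in MARKERS:
        return token[:-1], MARKERS[token[-1]]
    return token, None

def split_uncertain(slot):
    """Split a form on "/?" and read the marker of each part."""
    parts = re.split(r"\s*/\?\s*", slot.strip())
    return [read_marker(part) for part in parts]
```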
  
  Feel free to send me corrections for any of these questionable
  words.  I am mostly interested in the preferred form of the word in
  the case of /? or words marked with a ~ that are actually valid.
  
  Words are in mixed case but all accents have been stripped, thus words
  like café are instead cafe.
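  
  When looking up user input against the database, the same
  normalization has to be applied to the input.  One common way to do
  this in Python is shown below; it is not necessarily how the AGID
  script does it, and strip_accents is a hypothetical name.

```python
import unicodedata

def strip_accents(word):
    """Drop combining accent marks, keeping the base letters and case."""
    # NFD splits an accented letter into base letter + combining mark.
    decomposed = unicodedata.normalize("NFD", word)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))
```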
  
  The file "variant" contains a list of alternate inflections.
  
  The file "irregular" contains extra information where a noun or verb
  has irregular inflected forms.
  
  The file "dontuse" contains a list of words not to consider an
  inflected form of a word if more than one inflected form of a word is
  found.
  
  The files "prefixes" and "suffixes" contain lists of common prefixes
  and suffixes respectively.  These files are used by the script to
  produce inflected forms for words that end in a word in the
  "irregular" file.  If the beginning appears in the word list or the
  prefixes file and the ending appears in the irregular file, I also
  consider <prefix>+<irregular inflections>.  If the prefix is 3 letters
  or more OR appears in the prefixes file, and the suffix is 4 letters or
  more OR appears in the suffixes file, I consider it the most likely
  choice; otherwise I consider it a possible candidate but not the
  most likely choice.
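  
  The rule above can be sketched in Python as follows.  This is not the
  code from "make-infl"; the function names and the data structures (sets
  for the word list, prefixes and suffixes, and a dict from base word to
  inflections for the irregular file) are assumptions made for the
  example, as is reading the rule as (prefix test) AND (suffix test).

```python
def rank_split(beginning, ending, prefixes, suffixes):
    """Rank one beginning+ending split according to the length rule."""
    strong_beginning = len(beginning) >= 3 or beginning in prefixes
    strong_ending = len(ending) >= 4 or ending in suffixes
    if strong_beginning and strong_ending:
        return "most likely"
    return "candidate"

def compound_inflections(word, word_list, prefixes, suffixes, irregular):
    """Yield (rank, forms) for every split whose ending is irregular."""
    for i in range(1, len(word)):
        beginning, ending = word[:i], word[i:]
        if ending not in irregular:
            continue
        if beginning not in word_list and beginning not in prefixes:
            continue
        rank = rank_split(beginning, ending, prefixes, suffixes)
        # <prefix>+<irregular inflections>
        yield rank, [beginning + form for form in irregular[ending]]
```

  For instance, with "over" in the prefixes set and "take" mapped to its
  irregular forms, "overtake" is ranked "most likely", while a split such
  as "un" + "do" with "un" found only in the word list is kept as a
  "candidate".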
  
  The file "make-infl" is the actual Perl script used to create the
  database.
  
  CHANGES:
  
  From Revision 1 to 2 (August 18, 2000)
  
    Classified variants as either almost equal, also used, or
    secondary.
  
    The / is now used to indicate equal variants.  "/?" is now used to
    mean what "/" used to be.
  
    Lots of additional rules added which greatly improved the results.
  
  COPYRIGHT AND SOURCE:
  
  The final product is under the following copyright, as well as any
  copyrights mentioned below.
  
    Copyright 2000 by Kevin Atkinson
  
    Permission to use, copy, modify, distribute and sell this database,
    the associated scripts, the output created form the scripts and its
    documentation for any purpose is hereby granted without fee,
    provided that the above copyright notice appears in all copies and
    that both that copyright notice and this permission notice appear in
    supporting documentation. Kevin Atkinson makes no representations
    about the suitability of this array for any purpose. It is provided
    "as is" without express or implied warranty.
  
  The part-of-speech database used is created from the Moby
  part-of-speech database, which is in the public domain:
  
      The Moby lexicon project is complete and has
      been place into the public domain. Use, sell,
      rework, excerpt and use in any way on any platform.
      
      Placing this material on internal or public servers is
      also encouraged. The compiler is not aware of any
      export restrictions so freely distribute world-wide.
      
      You can verify the public domain status by contacting
      
      Grady Ward
      3449 Martha Ct.
      Arcata, CA  95521-4884
      
      grady@netcom.com
      grady@northcoast.com
  
  and the WordNet database which is under the following copyright:
  
      This software and database is being provided to you, the LICENSEE, by
      Princeton University under the following license.  By obtaining, using  
      and/or copying this software and database, you agree that you have  
      read, understood, and will comply with these terms and conditions.:  
    
      Permission to use, copy, modify and distribute this software and
      database and its documentation for any purpose and without fee or
      royalty is hereby granted, provided that you agree to comply with  
      the following copyright notice and statements, including the disclaimer,  
      and that the same appear on ALL copies of the software, database and  
      documentation, including modifications that you make for internal  
      use or for distribution.  
    
      WordNet 1.6 Copyright 1997 by Princeton University.  All rights reserved.  
    
      THIS SOFTWARE AND DATABASE IS PROVIDED "AS IS" AND PRINCETON  
      UNIVERSITY MAKES NO REPRESENTATIONS OR WARRANTIES, EXPRESS OR  
      IMPLIED.  BY WAY OF EXAMPLE, BUT NOT LIMITATION, PRINCETON  
      UNIVERSITY MAKES NO REPRESENTATIONS OR WARRANTIES OF MERCHANT-  
      ABILITY OR FITNESS FOR ANY PARTICULAR PURPOSE OR THAT THE USE  
      OF THE LICENSED SOFTWARE, DATABASE OR DOCUMENTATION WILL NOT  
      INFRINGE ANY THIRD PARTY PATENTS, COPYRIGHTS, TRADEMARKS OR  
      OTHER RIGHTS.
    
      The name of Princeton University or Princeton may not be used in  
      advertising or publicity pertaining to distribution of the software  
      and/or database.  Title to copyright in this software, database and  
      any associated documentation shall at all times remain with  
      Princeton University and LICENSEE agrees to preserve same.  
  
  The word list used is a combination of several word lists:
  
  1) Most of the word lists from the Moby Words package:
  
       10196pla.ces 113809of.fic 21986na.mes 256772co.mpo 354984si.ngl
       3897male.nam 4160offi.cia 4946fema.len 6213acro.nym 74550com.mon
     
     The Moby Words package, like the Part-Of-Speech database, is in the
     public domain.
  
  2) The ENABLE2K word list, which is in the public domain:
  
       The ENABLE master word list, WORD.LST, is herewith formally
       released into the Public Domain. Anyone is free to use it or
       distribute it in any manner they see fit. No fee or registration
       is required for its use nor are "contributions" solicited (if you
       feel you absolutely must contribute something for your own peace
       of mind, the authors of the ENABLE list ask that you make a
       donation on their behalf to your favorite charity). This word
       list is our gift to the Scrabble community, as an alternate to
       "official" word lists. Game designers may feel free to
       incorporate the WORD.LST into their games. Please mention the
       source and credit us as originators of the list. Note that if
       you, as a game designer, use the WORD.LST in your product, you
       may still copyright and protect your product, but you may *not*
       legally copyright or in any way restrict redistribution of the
       WORD.LST portion of your product. This *may* under law restrict
       your rights to restrict your users' rights, but that is only
       fair.
  
  3) All of the word lists in the ENABLE2K Supplement, which consists of:
  
       2DICTS.LST  ALSO.LST   LETTERS.LST  OSPDADD.LST  UCACR.LST
       ABLE.LST    LCACR.LST  NOPOS.LST    PLURALS.LST  UPPER.LST
  
     All of these word lists are also in the public domain.
  
  4) The list of signature words from the YAWL package which is in the
     public domain.
  
  5) The UK Advanced Cryptics Dictionary, which is under the following
     copyright:
  
       Copyright (c) J Ross Beresford 1993-1999. All Rights Reserved.
  
       The following restriction is placed on the use of this
       publication: if The UK Advanced Cryptics Dictionary is used
       in a software package or redistributed in any form, the
       copyright notice must be prominently displayed and the text
       of this document must be included verbatim.
  
       There are no other restrictions: I would like to see the
       list distributed as widely as possible.
  
  6) Some extra words found in the Part-Of-Speech database that were not
     found in any of the above word lists.
  
  7) Words found in the Jargon File Word List package, available at
     http://aspell.sourceforge.net/wl/, which is in the Public Domain.
  
  8) And finally some extra words that I added myself.  These words can be
     found in the file "extra-words".
  
  The "dontuse", "irregular", and "variant" files were created by me
  (Kevin Atkinson) from numerous sources.
  
  


(C) Æliens 04/09/2009

You may not copy or print any of this material without explicit permission of the author or the publisher. In case of other copyright issues, contact the author.