Academic Publishing Wiki

Application Example

Flag Recognition

Although flag recognition is not an intuitive optimization problem, it shows clearly how optimizing the order of characteristics can reduce the number of queries. Optimal classification is most useful for reducing the number of queries needed in manual identification; automated identification may be better served by a neural network.

Flag overlay grid

Flag grid x

Designated areas (characteristics) for sampling background colors (states) of all flags (elements).

The overlay is used to determine the color of each area of each flag, and that color is recorded in the table as the logical state of the area. The table data is then submitted to the optimization program and processed until an optimal empirical separatory value is obtained.

Data set

FLAGS/LOC,A,B,C,D,E,F,G,H,I
BELGIUM,BLACK,YELLOW,ORANGE,BLACK,YELLOW,ORANGE,BLACK,YELLOW,ORANGE
FRANCE,BLUE,WHITE,RED,BLUE,WHITE,RED,BLUE,WHITE,RED
GERMANY,BLACK,BLACK,BLACK,RED,RED,RED,YELLOW,YELLOW,YELLOW
IRELAND,GREEN,WHITE,ORANGE,GREEN,WHITE,ORANGE,GREEN,WHITE,ORANGE
ITALY,GREEN,WHITE,RED,GREEN,WHITE,RED,GREEN,WHITE,RED
JAPAN,WHITE,WHITE,WHITE,WHITE,RED,WHITE,WHITE,WHITE,WHITE
LUXEMBOURG,RED,RED,RED,WHITE,WHITE,WHITE,BABY,BABY,BABY
NETHERLANDS,RED,RED,RED,WHITE,WHITE,WHITE,BLUE,BLUE,BLUE
SPAIN,RED,RED,RED,YELLOW,YELLOW,YELLOW,RED,RED,RED

Original order

Original order flag example

Systematic query

Starting with area "A", the query begins by asking for the color in this area of the flag. Suppose we have in our possession the flag of the Netherlands. The answer to the first query, for area "A", is RED, which removes 2/3 of the flags from further consideration and leaves Luxembourg, the Netherlands and Spain. The answers for areas "B" and "C" are also RED and eliminate none of the remaining flags. The answer for area "D" is WHITE, which eliminates Spain (YELLOW in that area), but Luxembourg and the Netherlands share the same colors in columns "D", "E" and "F", so the two cannot be told apart until column "G", where the color BLUE provides a unique answer to the final necessary query. Here every remaining flag except that of the Netherlands is eliminated. It therefore takes seven queries using the systematic query method to establish that the flag in our possession belongs to the Netherlands.
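The walk-through above can be reproduced with a short script. The following is a minimal Python sketch, not the optimization program itself; the names `load_flags` and `systematic_query` are hypothetical, and the table is copied from the data set above.

```python
import csv
import io

# Flag table copied from the data set (areas A-I as columns).
FLAG_CSV = """FLAGS/LOC,A,B,C,D,E,F,G,H,I
BELGIUM,BLACK,YELLOW,ORANGE,BLACK,YELLOW,ORANGE,BLACK,YELLOW,ORANGE
FRANCE,BLUE,WHITE,RED,BLUE,WHITE,RED,BLUE,WHITE,RED
GERMANY,BLACK,BLACK,BLACK,RED,RED,RED,YELLOW,YELLOW,YELLOW
IRELAND,GREEN,WHITE,ORANGE,GREEN,WHITE,ORANGE,GREEN,WHITE,ORANGE
ITALY,GREEN,WHITE,RED,GREEN,WHITE,RED,GREEN,WHITE,RED
JAPAN,WHITE,WHITE,WHITE,WHITE,RED,WHITE,WHITE,WHITE,WHITE
LUXEMBOURG,RED,RED,RED,WHITE,WHITE,WHITE,BABY,BABY,BABY
NETHERLANDS,RED,RED,RED,WHITE,WHITE,WHITE,BLUE,BLUE,BLUE
SPAIN,RED,RED,RED,YELLOW,YELLOW,YELLOW,RED,RED,RED
"""


def load_flags(text):
    """Return (areas, {flag name: {area: color}}) parsed from CSV text."""
    rows = [row for row in csv.reader(io.StringIO(text.strip())) if row]
    areas = rows[0][1:]
    flags = {row[0]: dict(zip(areas, row[1:])) for row in rows[1:]}
    return areas, flags


def systematic_query(flags, areas, target):
    """Ask about each area in the given order.

    Returns a list of (area, answer, remaining candidates), stopping as
    soon as only one candidate flag is left.
    """
    candidates = set(flags)
    log = []
    for area in areas:
        answer = flags[target][area]
        candidates = {name for name in candidates if flags[name][area] == answer}
        log.append((area, answer, sorted(candidates)))
        if len(candidates) == 1:
            break
    return log


areas, flags = load_flags(FLAG_CSV)
log = systematic_query(flags, areas, "NETHERLANDS")
for area, answer, remaining in log:
    print(area, answer, remaining)
print("queries needed:", len(log))
```

The printed log shows which flags survive each answer, which makes it easy to see where a query in the original order does no work.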

Optimized order

Optimized order flag examplex

Minimized query

The results of optimization are shown above and include a listing of the theoretical and empirical percentages. The original characteristic sequence is indexed in the bottom row. Starting with area "G", the query begins by asking for the color in this area of the flag. Suppose we have in our possession the flag of Ireland. The answer to this first query would be GREEN, which leaves only Ireland and Italy. The next query is for the color in area "F", to which we would answer ORANGE. Since no other flag has this combination of GREEN and ORANGE in these areas, our query can end here. By optimizing the order of the characteristics, the method has minimized the number of queries required to identify the flag. (Please note that there may be more than one minimal solution.)
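One simple way to arrive at an order like this is a greedy heuristic: repeatedly pick the area that, together with the areas already chosen, splits the flags into the largest number of distinct groups. This is an assumption made for illustration; the source does not describe the algorithm used by the optimization program, and the names `greedy_order` and `queries_needed` are hypothetical. On this data set the sketch selects "G" first and "F" second.

```python
# Flag rows copied from the data set (areas A-I, in that order).
FLAG_ROWS = """\
BELGIUM,BLACK,YELLOW,ORANGE,BLACK,YELLOW,ORANGE,BLACK,YELLOW,ORANGE
FRANCE,BLUE,WHITE,RED,BLUE,WHITE,RED,BLUE,WHITE,RED
GERMANY,BLACK,BLACK,BLACK,RED,RED,RED,YELLOW,YELLOW,YELLOW
IRELAND,GREEN,WHITE,ORANGE,GREEN,WHITE,ORANGE,GREEN,WHITE,ORANGE
ITALY,GREEN,WHITE,RED,GREEN,WHITE,RED,GREEN,WHITE,RED
JAPAN,WHITE,WHITE,WHITE,WHITE,RED,WHITE,WHITE,WHITE,WHITE
LUXEMBOURG,RED,RED,RED,WHITE,WHITE,WHITE,BABY,BABY,BABY
NETHERLANDS,RED,RED,RED,WHITE,WHITE,WHITE,BLUE,BLUE,BLUE
SPAIN,RED,RED,RED,YELLOW,YELLOW,YELLOW,RED,RED,RED
"""

AREAS = list("ABCDEFGHI")
FLAGS = {}
for line in FLAG_ROWS.strip().splitlines():
    name, *colors = line.split(",")
    FLAGS[name] = dict(zip(AREAS, colors))


def group_count(flags, areas):
    """Number of distinct color combinations over the given areas."""
    return len({tuple(flags[name][a] for a in areas) for name in flags})


def greedy_order(flags, areas):
    """Greedy heuristic (an assumption, not the source's algorithm).

    At each step, choose the area that yields the most distinct groups
    when combined with the areas already chosen. Ties go to the area
    that comes first in the original order.
    """
    chosen, remaining = [], list(areas)
    while remaining:
        best = max(remaining, key=lambda a: group_count(flags, chosen + [a]))
        chosen.append(best)
        remaining.remove(best)
    return chosen


def queries_needed(flags, order, target):
    """Count the queries asked, in the given order, until the target is unique."""
    candidates = set(flags)
    for asked, area in enumerate(order, start=1):
        answer = flags[target][area]
        candidates = {n for n in candidates if flags[n][area] == answer}
        if len(candidates) == 1:
            return asked
    return len(order)  # target shares every state with some other flag


order = greedy_order(FLAGS, AREAS)
print("greedy order:", "".join(order))
for name in FLAGS:
    print(name, queries_needed(FLAGS, AREAS, name), queries_needed(FLAGS, order, name))
```

The last loop prints, for each flag, the number of queries needed in the original order next to the number needed in the greedy order, so the two approaches can be compared directly.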

Computer program GUI

Optimal classification

Name query

I have a list of names that serve as the dependent values of a truth table. The independent variables are the letters of the alphabet, and each cell contains the number of times that letter occurs in the name.[1]

words a b c d e f g h i j k l m n o p q r s t u v w x y z
abell 1 1 0 0 1 0 0 0 0 0 0 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0
adamo 2 0 0 1 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0 0 0 0 0 0 0 0
adolphus 1 0 0 1 0 0 0 1 0 0 0 1 0 0 1 1 0 0 1 0 1 0 0 0 0 0
agostino 1 0 0 0 0 0 1 0 1 0 0 0 0 1 2 0 0 0 1 1 0 0 0 0 0 0
akai 2 0 0 0 0 0 0 0 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
albery 1 1 0 0 1 0 0 0 0 0 0 1 0 0 0 0 0 1 0 0 0 0 0 0 1 0
alfonso 1 0 0 0 0 1 0 0 0 0 0 1 0 1 2 0 0 0 1 0 0 0 0 0 0 0
alli 1 0 0 0 0 0 0 0 1 0 0 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0
alteen 1 0 0 0 2 0 0 0 0 0 0 1 0 1 0 0 0 0 0 1 0 0 0 0 0 0
amarjit 2 0 0 0 0 0 0 0 1 1 0 0 1 0 0 0 0 1 0 1 0 0 0 0 0 0
amparo 2 0 0 0 0 0 0 0 0 0 0 0 1 0 1 1 0 1 0 0 0 0 0 0 0 0
andrea 2 0 0 1 1 0 0 0 0 0 0 0 0 1 0 0 0 1 0 0 0 0 0 0 0 0
anhtuan 2 0 0 0 0 0 0 1 0 0 0 0 0 2 0 0 0 0 0 1 1 0 0 0 0 0

. . .

wina 1 0 0 0 0 0 0 0 1 0 0 0 0 1 0 0 0 0 0 0 0 0 1 0 0 0
witney 0 0 0 0 1 0 0 0 1 0 0 0 0 1 0 0 0 0 0 1 0 0 1 0 1 0
woodyer 0 0 0 1 1 0 0 0 0 0 0 0 0 0 2 0 0 1 0 0 0 0 1 0 1 0
wymard 1 0 0 1 0 0 0 0 0 0 0 0 1 0 0 0 0 1 0 0 0 0 1 0 1 0
xylia 1 0 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1 1 0
yasmin 1 0 0 0 0 0 0 0 1 0 0 0 1 1 0 0 0 0 1 0 0 0 0 0 1 0
yerga 1 0 0 0 1 0 1 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0
yolanda 2 0 0 1 0 0 0 0 0 0 0 1 0 1 1 0 0 0 0 0 0 0 0 0 1 0
youping 0 0 0 0 0 0 1 0 1 0 0 0 0 1 1 1 0 0 0 0 1 0 0 0 1 0
zadorozn 1 0 0 1 0 0 0 0 0 0 0 0 0 1 2 0 0 1 0 0 0 0 0 0 0 2
zarlenga 2 0 0 0 1 0 1 0 0 0 0 1 0 1 0 0 0 1 0 0 0 0 0 0 0 1
zenkner 0 0 0 0 2 0 0 0 0 0 1 0 0 2 0 0 0 1 0 0 0 0 0 0 0 1
zissis 0 0 0 0 0 0 0 0 2 0 0 0 0 0 0 0 0 0 3 0 0 0 0 0 0 1
zukas 1 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 1 0 1 0 0 0 0 1
zywiel 0 0 0 0 1 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 1 0 1 1
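A table of this kind can be generated directly from the list of names. The following is a minimal Python sketch using a handful of the names shown above; the function name `letter_counts` is hypothetical.

```python
import string
from collections import Counter


def letter_counts(name, alphabet=string.ascii_lowercase):
    """Return the number of occurrences of each letter of the alphabet in name."""
    counts = Counter(name.lower())
    return [counts[ch] for ch in alphabet]


# A few names taken from the table above.
names = ["abell", "adamo", "akai", "andrea", "zissis"]
table = {name: letter_counts(name) for name in names}

# Print in the same layout as the table: header row, then one row per name.
print("words", *string.ascii_lowercase)
for name, row in table.items():
    print(name, *row)
```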

Suppose we want to find the name "andrea" using the fewest queries.

Once the table is optimized, all we need to do is ask, in the order determined by the program, how many of each letter the name contains.

words a e i n r o l s t h d m c u k y g b p f w v z j q x
tateyama 3 1 0 0 0 0 0 0 2 0 0 1 0 0 0 1 0 0 0 0 0 0 0 0 0 0
kamiyama 3 0 1 0 0 0 0 0 0 0 0 2 0 0 1 1 0 0 0 0 0 0 0 0 0 0
mcnamara 3 0 0 1 1 0 0 0 0 0 0 2 1 0 0 0 0 0 0 0 0 0 0 0 0 0
natalie 2 1 1 1 0 0 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
sebastia 2 1 1 0 0 0 0 2 1 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0
sheidafa 2 1 1 0 0 0 0 1 0 1 1 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0
zarlenga 2 1 0 1 1 0 1 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 1 0 0 0
maarten 2 1 0 1 1 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0
andrea 2 1 0 1 1 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
reagan 2 1 0 1 1 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0
lauderda 2 1 0 0 1 0 1 0 0 0 2 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0
lehtovaa 2 1 0 0 0 1 1 0 1 1 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0
mastella 2 1 0 0 0 0 2 1 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0
salapek 2 1 0 0 0 0 1 1 0 0 0 0 0 0 1 0 0 0 1 0 0 0 0 0 0 0
gianina 2 0 2 2 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0

First we ask how many "a"s are in "andrea". The answer, 2, eliminates 61.61% of the choices in the full list. The next question is how many "e"s there are, to which the answer is 1. Together with the previous answer, this eliminates 84.73% of the original possibilities. The number of "i"s is zero, which in combination with the earlier answers eliminates 93% of the original possibilities. In just three queries we have eliminated 93% of the original possibilities and may be able to start looking at adjacent entries to see whether "andrea" is in the remaining 7%.
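The cumulative elimination after each query can be computed as follows. This is a minimal Python sketch on a small sample of names taken from the tables above, so its fractions will not match the percentages quoted for the full list; the function name `eliminated_after_each_query` is hypothetical, and the letter order is copied from the header of the optimized table.

```python
from collections import Counter

# Letter order copied from the header row of the optimized table.
LETTER_ORDER = "aeinrolsthdmcukygbpfwvzjqx"

# Small sample only; the full list of names is not reproduced here.
SAMPLE_NAMES = [
    "abell", "adamo", "akai", "alli", "andrea", "reagan",
    "maarten", "zarlenga", "gianina", "tateyama", "witney", "zissis",
]


def eliminated_after_each_query(names, target, letter_order):
    """Return a list of (letter, answer, fraction of names eliminated so far).

    Queries are asked in letter_order and stop once only the target remains.
    """
    target_counts = Counter(target)
    candidates = list(names)
    results = []
    for letter in letter_order:
        answer = target_counts[letter]
        candidates = [n for n in candidates if n.count(letter) == answer]
        results.append((letter, answer, 1 - len(candidates) / len(names)))
        if len(candidates) == 1:
            break
    return results


results = eliminated_after_each_query(SAMPLE_NAMES, "andrea", LETTER_ORDER)
for letter, answer, fraction in results:
    print(f"{letter}: {answer}  eliminated {fraction:.2%}")
```

In this sample, "andrea" and "reagan" stay together until the query for "d", which mirrors their adjacent rows in the optimized table.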

Class exercises

Have the class make up a tentative list of attributes and a few attribute values, such as leaf size (large, medium or small) or leaf shape (rounded, pointed, etc.), in preparation for data collection.

Students can freely add attributes and/or values of their own.

Head for the field and have every student describe the attributes and attribute values of each plant with single words if possible and short phrases if not.

Return to the classroom and have all the students compare and merge attributes and/or attribute values to end up with as few attributes and attribute values as possible without hindering identification of a plant in any student’s mind.

Put the data into an Excel spreadsheet with plant names in the leftmost column, the attributes in the top row, and the attribute values in the cells, as shown.

SCIENTIFIC-NAME	ACTIVE-GROWTH-PERIOD	AFTER-HARVEST-REGROWTH-RATE	ALLELOPATH	BLOAT	C-N-RATIO	COPPICE-POTENTIAL	FALL-CONSPICUOUS	FIRE-RESISTANCE
THAMNOSMA-MONTANA	SPRING-AND-FALL	-	NO	NONE	HIGH	YES	NO	NO
LUPINUS-ALBUS	FALL-WINTER-AND-SPRING	SLOW	NO	LOW	LOW	NO	YES	NO
DICHANTHELIUM-ACICULARE	SUMMER	SLOW	NO	NONE	-	NO	NO	NO
RUPPIA-MARITIMA	SPRING-AND-SUMMER	NONE	NO	NONE	LOW	NO	NO	NO
BOTHRIOCHLOA-BARBINODIS	SUMMER-AND-FALL	MODERATE	NO	NONE	MEDIUM	NO	NO	NO
HYPERICUM-DENTICULATUM	SUMMER	NONE	NO	NONE	MEDIUM	NO	NO	NO
PLEURAPHIS-MUTICA	SPRING-AND-SUMMER	MODERATE	NO	NONE	HIGH	NO	NO	NO
MIMULUS-DENTATUS	SPRING-AND-SUMMER	SLOW	NO	NONE	MEDIUM	NO	NO	NO
TRIFOLIUM-ALEXANDRINUM	SPRING-AND-SUMMER	SLOW	NO	NONE	LOW	NO	NO	NO
PIERIS-JAPONICA	SPRING-AND-SUMMER	-	NO	-	HIGH	NO	YES	YES
STENOTAPHRUM-SECUNDATUM	SUMMER-AND-FALL	RAPID	NO	NONE	MEDIUM	NO	NO	NO
FRAXINUS-CAROLINIANA	SPRING	-	NO	NONE	-	YES	YES	NO
NYMPHAEA-MEXICANA	SPRING	-	NO	-	MEDIUM	NO	NO	NO
LEUCOTHOE-AXILLARIS	SPRING	-	NO	-	MEDIUM	NO	NO	NO
PENNISETUM-FLACCIDUM	SUMMER-AND-FALL	MODERATE	NO	NONE	MEDIUM	NO	NO	NO
LATHYRUS-POLYPHYLLUS	SPRING-AND-SUMMER	SLOW	NO	NONE	LOW	NO	NO	NO
BAPTISIA-TINCTORIA	SUMMER	SLOW	NO	NONE	MEDIUM	NO	NO	NO
IVA-FRUTESCENS	SPRING-AND-SUMMER	-	NO	NONE	HIGH	NO	NO	NO
ZENOBIA-PULVERULENTA	SPRING-AND-SUMMER	RAPID	NO	-	HIGH	NO	NO	YES
PANICUM-ANTIDOTALE	SUMMER	MODERATE	NO	NONE	MEDIUM	NO	NO	NO
PHOTINIA-FRASERI	SPRING-AND-SUMMER	-	NO	-	HIGH	NO	YES	NO
ALNUS-MARITIMA	SPRING-AND-SUMMER	-	NO	NONE	MEDIUM	NO	YES	NO
NANDINA-DOMESTICA	SPRING-AND-SUMMER	-	NO	-	MEDIUM	NO	NO	NO
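Before the data is processed, it can be useful to check whether any plants share exactly the same attribute values, since no ordering of queries can separate them. The following is a minimal Python sketch on a few rows copied from the table above; the function name `indistinguishable_groups` is hypothetical. For a real spreadsheet exported as tab-separated text, the rows could be read with `csv.reader(f, delimiter="\t")`.

```python
from collections import defaultdict

# A few rows copied from the table above: name first, then attribute values.
ROWS = [
    ("LUPINUS-ALBUS", "FALL-WINTER-AND-SPRING", "SLOW", "NO", "LOW", "LOW", "NO", "YES", "NO"),
    ("TRIFOLIUM-ALEXANDRINUM", "SPRING-AND-SUMMER", "SLOW", "NO", "NONE", "LOW", "NO", "NO", "NO"),
    ("PIERIS-JAPONICA", "SPRING-AND-SUMMER", "-", "NO", "-", "HIGH", "NO", "YES", "YES"),
    ("NYMPHAEA-MEXICANA", "SPRING", "-", "NO", "-", "MEDIUM", "NO", "NO", "NO"),
    ("LEUCOTHOE-AXILLARIS", "SPRING", "-", "NO", "-", "MEDIUM", "NO", "NO", "NO"),
    ("LATHYRUS-POLYPHYLLUS", "SPRING-AND-SUMMER", "SLOW", "NO", "NONE", "LOW", "NO", "NO", "NO"),
]


def indistinguishable_groups(rows):
    """Group plant names whose attribute values are identical in every column."""
    by_values = defaultdict(list)
    for name, *values in rows:
        by_values[tuple(values)].append(name)
    # Keep only the groups that contain more than one plant.
    return [names for names in by_values.values() if len(names) > 1]


groups = indistinguishable_groups(ROWS)
for names in groups:
    print("cannot be separated:", ", ".join(names))
```

Any group reported this way suggests that the class needs an additional attribute, or a finer attribute value, to tell those plants apart.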

Submit the spreadsheet for processing by the computer.

  1. Names made of symbols with a predefined order have the advantage that they can be sorted on that order at each position in the name, which achieves optimal (actually inherent) classification.