INDEX
Explanations
prepositions and conjunctions
connecting words and phrases that indicate relationships or associations
Negative Logits
Buk      -0.76
Bundy    -0.73
Mate     -0.69
Baghd    -0.68
Tile     -0.68
Goo      -0.66
Truck    -0.66
Signal   -0.66
Knot     -0.65
Wad      -0.65
Positive Logits
selves      1.06
theless     1.04
terday      0.99
ilogy       0.94
redients    0.93
fore        0.87
have        0.86
rontal      0.85
vernment    0.84
ards        0.82
Activations Density 0.178%