INDEX
Explanations
proper nouns and specific phrases related to locations or institutions
instances of the word "OK"
Negative Logits
craft        -0.73
omorphic     -0.70
minecraft    -0.68
brim         -0.67
clud         -0.66
oidal        -0.66
istics       -0.66
riber        -0.65
dimension    -0.65
bent         -0.65
Positive Logits
AY           1.23
lahoma       1.16
awaru        0.92
YA           0.82
OK           0.79
WARD         0.77
ALLY         0.74
ESCO         0.72
SW           0.70
ANE          0.69
Activations Density 0.013%