INDEX
Explanations
words related to specific names
names and references to specific individuals
Negative Logits
kefeller     -0.78
theless      -0.71
ascript      -0.67
acebook      -0.64
Kissinger    -0.63
thirds       -0.61
cracked      -0.61
MPG          -0.59
Versions     -0.59
SOURCE       -0.58
Positive Logits
uchi         1.00
atsu         0.90
uni          0.87
aki          0.87
ikuman       0.87
ashi         0.85
agi          0.84
Lumpur       0.83
awa          0.81
ushi         0.81
Activations Density: 0.225%