INDEX
Explanations
the word "Kal" with varying levels of activation strength
references to a specific individual named Kalashnikov
NEGATIVE LOGITS
ãĥ¼ãĥĨãĤ£    -0.92
ãģį          -0.71
IBLE         -0.65
AGE          -0.64
ORTS         -0.63
çĶŁ          -0.62
UCT          -0.61
halftime     -0.61
degrade      -0.60
åij          -0.60
POSITIVE LOGITS
adesh        1.27
amaz         1.26
arov         1.01
endar        1.00
inda         0.97
ibr          0.97
iev          0.96
ita          0.95
endars       0.94
tering       0.93
Activations Density 0.018%
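
Several entries in the negative-logit list (for example ãĥ¼ãĥĨãĤ£ and çĶŁ) look garbled but are probably raw vocabulary strings rather than encoding damage. The sketch below assumes the model uses a GPT-2-style byte-level BPE vocabulary, which the dashboard itself does not state. Under that assumption it maps each displayed character back to its byte and decodes the result as UTF-8. The names `gpt2_bytes_to_unicode` and `decode_token` are hypothetical helpers written for this illustration, not part of any dashboard API.

```python
def gpt2_bytes_to_unicode():
    """Rebuild the byte -> printable-character table used by GPT-2-style
    byte-level BPE (assumption: the dashboard's tokenizer follows it)."""
    printable = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(ord("¡"), ord("¬") + 1))
        + list(range(ord("®"), ord("ÿ") + 1))
    )
    byte_values = printable[:]
    code_points = printable[:]
    shift = 0
    for b in range(256):
        if b not in printable:
            # Non-printable bytes are remapped to code points from 256 upward.
            byte_values.append(b)
            code_points.append(256 + shift)
            shift += 1
    return {b: chr(c) for b, c in zip(byte_values, code_points)}


# Inverse table: displayed character -> original byte value.
CHAR_TO_BYTE = {ch: b for b, ch in gpt2_bytes_to_unicode().items()}


def decode_token(token: str) -> str:
    """Turn a displayed vocabulary string back into readable text.

    Tokens that stop in the middle of a multi-byte character decode to
    the Unicode replacement character instead of raising an error.
    """
    raw = bytes(CHAR_TO_BYTE[ch] for ch in token)
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    for tok in ["ãĥ¼ãĥĨãĤ£", "ãģį", "çĶŁ", "åij", "halftime"]:
        print(repr(tok), "->", repr(decode_token(tok)))
```

If the assumption holds, ãĥ¼ãĥĨãĤ£ decodes to ーティ, ãģį to き, and çĶŁ to 生. åij covers only the first two bytes of a three-byte character, so it has no standalone readable form. ASCII tokens such as halftime pass through unchanged.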