INDEX
Explanations
the word "Tak" followed by a high numerical value
repeated mentions of the name "Tak"
Negative Logits
Token        Logit
afort        -0.81
iths         -0.77
pard         -0.72
ecast        -0.72
cedented     -0.69
sheets       -0.69
umption      -0.68
umn          -0.67
mble         -0.67
ffield       -0.66

Positive Logits
Token        Logit
UCK          0.82
istani       0.79
ota          0.77
ICLE         0.74
atsuki       0.69
rade         0.69
Hutchinson   0.68
Ame          0.67
umar         0.67
uning        0.66
Activations Density: 0.075%