INDEX
Explanations
occurrences of the term "gorilla" in various contexts
Negative Logits
paren      -0.83
ply        -0.80
ensible    -0.80
log        -0.78
daq        -0.78
schild     -0.75
leness     -0.72
orough     -0.71
iosyn      -0.71
mare       -0.69
Positive Logits
ieri         0.98
istas        0.88
Haram        0.82
esi          0.82
usk          0.79
Ammunition   0.79
ñ            0.78
ista         0.76
ume          0.76
ength        0.75
Activations Density: 0.006%