INDEX
Explanations
instances of the word "known" and its variations to indicate fame or recognition
Negative Logits
toa         -0.16
il          -0.16
pie         -0.16
ocha        -0.15
ements      -0.15
saw         -0.14
ement       -0.14
emit        -0.14
gin         -0.14
arov        -0.14
Positive Logits
known        0.19
known        0.18
throughout   0.18
ahir         0.17
equally      0.16
PartialView  0.16
among        0.16
Known        0.16
ptune        0.15
amongst      0.15
Activations Density: 0.026%
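
The density figure is presumably the fraction of token positions on which this feature fires. The page does not say how it is computed, so the sketch below is an illustration under that assumption only: the function name `activation_density`, the zero threshold, and the toy data are all hypothetical, not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of token positions where the
    feature is active. The threshold of 0.0 is a hypothetical choice.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical toy data: 10,000 token positions, feature active on 3 of them.
toy = [0.0] * 10_000
for i in (12, 4_500, 9_001):
    toy[i] = 1.7

density = activation_density(toy)
print(f"{density:.3%}")  # 0.030%
```

Under this reading, a density of 0.026% would mean the feature is active on roughly 26 of every 100,000 token positions.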