INDEX
Explanations
specific details within a longer text; in this case, it activates on the word 'has'
the presence of Pokémon or gaming terminology related to actions
Negative Logits
Token     Logit
esty      -0.73
Bank      -0.66
elsen     -0.64
wig       -0.62
Forms     -0.61
PASS      -0.60
ãĥŁ       -0.60
minist    -0.60
UNHCR     -0.60
ocide     -0.60
Positive Logits
Token     Logit
ordering  0.69
'>        0.68
anza      0.65
pects     0.64
bral      0.62
lished    0.62
keyes     0.61
ourage    0.59
beard     0.57
Mellon    0.57
Activations Density 0.000%