Explanations
female pronouns
references to authors and their interpretations
Negative Logits
Token          Logit
gettable       -0.83
VERTISEMENT    -0.80
srfAttach      -0.77
Leban          -0.77
Ranked         -0.69
neighb         -0.68
neath          -0.67
Decay          -0.66
Leilan         -0.66
Goo            -0.65
Positive Logits
Token          Logit
misunderstand  1.12
misinterpret   1.08
miscon         1.04
quote          1.03
misrepresent   1.03
phr            1.02
misunderstood  1.02
quoting        0.98
correctly      0.97
exagger        0.94
Activations Density: 0.588%
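
The density figure above is usually read as the fraction of sampled token positions on which the feature fires. The sketch below shows one way such a number could be computed. It assumes that "fires" means an activation strictly above a threshold of zero, which the page does not state. The function name `activation_density` and the toy data are hypothetical, not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumption: a position counts as active when its activation is strictly
    greater than `threshold` (0.0 by default).
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical toy data: 1000 token positions, 6 of which are active.
toy_activations = [0.0] * 994 + [0.4, 1.2, 0.1, 2.5, 0.7, 0.3]
density = activation_density(toy_activations)
print(f"{density:.3%}")  # prints "0.600%"
```

Under this reading, a density of 0.588% would mean the feature is active on roughly 6 of every 1000 sampled tokens.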