INDEX
Explanations
the presence of specific personal pronouns and references to individuals
Negative Logits
ruba      -0.17
Forum     -0.15
reform    -0.15
κου       -0.14
McKenzie  -0.14
Mercer    -0.14
gr        -0.14
Reform    -0.14
Forum     -0.14
Duffy     -0.14
Positive Logits
ouses     0.15
asl       0.15
GLE       0.15
STORE     0.14
ighton    0.14
ASA       0.14
isas      0.14
ASA       0.14
AIT       0.14
ARI       0.14
Activations Density 0.012%