Explanations
first-person pronouns and expressions of personal reflection or opinion
Negative Logits
agar      -0.15
eld       -0.15
igan      -0.14
onds      -0.14
acha      -0.14
ilder     -0.14
ff        -0.14
ff        -0.14
ÅĻÃŃ      -0.13
Porter    -0.13
Positive Logits
utto      0.17
èĩ        0.15
kraj      0.15
entarios  0.14
Hass      0.14
пиÑī      0.14
ë¯        0.14
Ëĺ        0.14
ADC       0.14
utow      0.13
Activations Density 0.076%
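
The density figure above can be read as the fraction of token positions on which the feature activates. The page does not state its exact definition, so the following is only a minimal sketch under that assumption: the function name `activation_density`, the zero threshold, and the sample counts (76 active positions out of 100,000) are all hypothetical and chosen purely for illustration.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of token positions where the
    feature fires at all; the source page does not define it.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 100,000 token positions, 76 of which activate the feature.
sample = [0.0] * (100_000 - 76) + [1.0] * 76
density = activation_density(sample)
print(f"{density:.3%}")
```

A density this low would mean the feature fires on fewer than one token in a thousand, which is typical of a sparse feature.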