INDEX
Explanations
references to humans or human beings in various contexts
Negative Logits
setVerticalGroup    -0.78
oriasis             -0.66
enderror            -0.63
Persons             -0.62
perſon              -0.62
Laufe               -0.59
Persons             -0.59
copg                -0.59
Становништво        -0.59
persons             -0.59
Positive Logits
humanity    1.21
humans      1.17
Humans      1.05
Humans      1.03
Humanity    1.00
humanos     0.89
mankind     0.88
manusia     0.87
humans      0.86
umani       0.85
Activations Density: 0.090%