Explanations
mentions of specific individuals and organizations involved in political or legal contexts
Negative Logits
cat     -0.50
Make    -0.49
Q       -0.46
Make    -0.45
|       -0.44
สัม     -0.44
az      -0.44
cedo    -0.43
nem     -0.42
esty    -0.42
Positive Logits
ſelf            0.94
purpoſe         0.92
Jefus           0.90
oredCriteria    0.87
featureID       0.84
Monfieur        0.84
Autoritní       0.82
faſt            0.82
myſelf          0.81
Chrif           0.81
Activations Density: 0.128%