INDEX
Explanations
expressions related to opinions and assessments
Negative Logits
Majefty      -0.88
consuls      -0.86
Chriftian    -0.83
Beijinhos    -0.80
ValueStyle   -0.80
Atentamente  -0.79
themſelves   -0.79
Monfieur     -0.78
Houſe        -0.77
Efq          -0.77
Positive Logits
its          0.75
it           0.68
It           0.64
easy         0.62
Its          0.62
snowing      0.61
Its          0.59
It           0.59
它           0.59
easier       0.59
Activations Density: 0.431%