Explanations
concepts and discussions around perceptions and interpretations
Negative Logits
Token      Logit
Efq        -0.83
ftill      -0.70
houſe      -0.65
первых     -0.62
Majefty    -0.62
againſt    -0.61
fubject    -0.60
siguran    -0.60
caufe      -0.59
ſeveral    -0.59
Positive Logits
Token      Logit
treated    0.97
regarded   0.93
treating   0.88
viewed     0.86
treats     0.81
Treated    0.80
treat      0.80
Treat      0.74
treated    0.73
treat      0.73
Activations Density 0.329%