INDEX
Explanations
quantitative data related to victims and refugees in conflicts
Negative Logits
itſelf      -0.90
pleaſure    -0.90
houſe       -0.86
Efq         -0.86
chofe       -0.86
ſeveral     -0.84
myſelf      -0.83
ſtate       -0.81
Majefty     -0.81
Monfieur    -0.80
Positive Logits
off         0.52
بوابة       0.50
"           0.49
di          0.48
exe         0.47
fully       0.47
Fla         0.47
shit        0.45
form        0.44
T           0.44
Activations Density 0.027%