INDEX
Explanations
mentions of koalas and their wellbeing
Negative Logits
Token     Logit
h         -0.33
e         -0.31
es        -0.30
y         -0.25
hu        -0.24
hh        -0.23
le        -0.22
hab       -0.22
hari      -0.22
hani      -0.22
Positive Logits
Token     Logit
úa        0.17
VERR      0.16
dzi       0.16
_Lean     0.16
ska       0.15
ÛĮÙħÛĮ    0.15
dre       0.15
ccione    0.14
zych      0.14
erland    0.14
Activations Density: 0.137%