INDEX
Explanations
expressions of honesty or frankness
honesty and frankness
Negative Logits
stination     -0.50
landais       -0.46
جمعیت         -0.44
arşivlendi    -0.44
wireType      -0.44
ligiloj       -0.43
imc           -0.43
isolado       -0.43
VIDEOT        -0.43
PyExc         -0.43
Positive Logits
frankly       0.77
Honestly      0.76
Frankly       0.74
honestly      0.70
Honestly      0.70
honestly      0.69
Tbh           0.59
tbh           0.58
正直          0.56
honest        0.54
Activations Density: 0.009%
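
To make the density figure concrete, here is a minimal sketch of how such a number could be computed, assuming that activation density means the fraction of token positions on which the feature's activation is above zero. The function name `activation_density`, the `threshold` parameter, and the sample activations are hypothetical and for illustration only; they are not taken from the dashboard or its tooling.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" is (number of active tokens) / (total tokens).
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: the feature fires on 9 out of 100,000 tokens.
acts = [0.0] * 100_000
for i in range(9):
    acts[i * 1000] = 1.5  # arbitrary nonzero activation value

density = activation_density(acts)
print(f"{density:.3%}")  # 0.009%
```

Under this reading, a density of 0.009% would correspond to the feature firing on roughly 9 of every 100,000 tokens, which fits a feature tied to a narrow set of words such as "frankly" and "honestly".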