INDEX
Explanations
passionate proclamations or chants of loyalty and support towards political figures or ideologies
Negative Logits
Token            Logit
+#+              -0.64
writeFieldEnd    -0.61
arschijnlijk     -0.60
bezeichneter     -0.57
MessageOf        -0.56
himo             -0.54
незавершена      -0.54
старости         -0.54
perhaps          -0.54
ख़                -0.53
Positive Logits
Token            Logit
!"               0.82
!”               0.75
!'               0.73
!                0.71
!!!"             0.70
!’               0.67
!!!!             0.67
!!!!!!           0.66
!!!”             0.65
!!!!!            0.63
Activations Density: 0.158%