INDEX
Explanations
expressions related to governmental or institutional critique
Negative Logits
udah        -0.72
kinda       -0.71
чтоб        -0.70
eachother   -0.66
ज़           -0.64
youll       -0.64
”…          -0.62
…’          -0.60
loosing     -0.60
sorta       -0.60
Positive Logits
hon          0.70
Хочу         0.67
Column       0.63
-[           0.62
Government   0.61
◄            0.60
SHRI         0.60
Parliament   0.58
Mr           0.57
"""          0.56
Activation Density: 0.068%