Explanations
significant political statements and claims
Negative Logits
öl              -0.16
ogui            -0.15
chandle         -0.14
acters          -0.14
@student        -0.13
panse           -0.13
ASK             -0.13
TestingModule   -0.13
.Symbol         -0.13
konusu          -0.13
Positive Logits
statement       0.57
statements      0.46
remarks         0.43
remark          0.42
statement       0.42
comments        0.42
comment         0.41
Statement       0.41
speech          0.40
Statement       0.40
Activations Density: 0.140%
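
A density figure like this one is usually read as the share of sampled token positions on which the feature fires. The sketch below illustrates that reading only. It assumes density means "fraction of activations above zero"; the function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and do not come from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of activations strictly above `threshold`.

    Assumed definition: the dashboard does not state how density is computed.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 10,000 token positions, 14 of which activate the feature.
sample = [0.0] * 9986 + [1.2] * 14
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this assumption, a density of 0.140% corresponds to roughly 14 active positions per 10,000 tokens, which fits a feature tied to a narrow concept such as political statements and remarks.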