Explanations
expressions of concern or worry regarding issues
Negative Logits
Token         Logit
lite          -0.76
ingers        -0.75
nice          -0.71
Carbuncle     -0.70
Bom           -0.62
ortunately    -0.61
ctors         -0.59
WIN           -0.59
slick         -0.58
Adinida       -0.58
Positive Logits
Token         Logit
warts         1.06
wart          1.06
raised        0.90
regarding     0.87
voiced        0.86
trolling      0.84
lessly        0.83
concerns      0.79
arising       0.78
expressed     0.74
Activations Density: 0.030%