Explanations
instances that signify irony or unexpected outcomes
Negative Logits
Token             Logit
bootstrapcdn      -0.57
ArgsConstructor   -0.57
OGS               -0.57
thâu              -0.56
PerformLayout     -0.56
новниш            -0.54
Marius            -0.53
Himo              -0.52
parms             -0.52
intensities       -0.51
Positive Logits
Token             Logit
Ironically        0.68
ironically        0.65
brigens           0.63
xically           0.59
fact              0.59
Interestingly     0.58
faktisk           0.57
何と              0.57
なんと            0.56
Кстати            0.56
Activations Density: 0.247%