INDEX
Explanations
phrases related to news articles or reports
references to specific organizations or entities, particularly those represented by abbreviations or acronyms
Negative Logits
enegger     -0.89
baugh       -0.78
ebted       -0.78
shaw        -0.77
ptin        -0.70
sidx        -0.67
anamo       -0.67
rador       -0.65
praise      -0.64
redients    -0.63
Positive Logits
DF    1.06
TP    1.00
LC    0.99
Vs    0.98
RP    0.97
TC    0.97
FK    0.97
ZE    0.94
DS    0.94
AC    0.93
Activations Density: 0.102%