INDEX
Explanations
non-English words or phrases
specific terms related to incidents involving faxes and certain individuals' names
Negative Logits
é¾                  -0.73
Improvement         -0.69
©¶æ                 -0.69
Policy              -0.67
shaw                -0.67
Grove               -0.66
Clicker             -0.66
temples             -0.64
externalToEVAOnly   -0.64
¸                   -0.62
Positive Logits
isin                0.94
elin                0.79
FIA                 0.77
enegger             0.75
hem                 0.74
gerald              0.73
ional               0.72
DOS                 0.70
itri                0.70
ANA                 0.70
Activations Density: 0.021%