INDEX
Explanations
terms related to deception or dishonesty
keywords related to legal issues and financial activities
Negative Logits
atchewan    -0.73
Amazing     -0.70
anse        -0.69
ansas       -0.66
orpor       -0.66
ataka       -0.66
Tokens      -0.66
Appropri    -0.65
Alz         -0.65
Nebula      -0.65
Positive Logits
fals        0.91
esty        0.88
refin       0.87
inished     0.81
ailed       0.78
ibel        0.76
inance      0.74
ysis        0.72
ILS         0.72
owler       0.71
Activations Density: 0.035%