INDEX
Explanations
identifying false information
Negative Logits
Premiership     0.79
Mohawk          0.72
lease           0.72
rams            0.70
sumo            0.69
leasing         0.67
lessor          0.67
leases          0.65
Java            0.65
servlet         0.64
Positive Logits
debunk          1.33
falsehood       1.28
Fake            1.20
inaccuracy      1.18
disinformation  1.17
misinformation  1.15
fake            1.15
Truth           1.15
veracity        1.11
Verification    1.11
Activations Density: 0.549%