Explanations
occurrences of the word "caught" and related actions indicating detection or observation
Negative Logits
Token             Logit
تحص               -0.61
bly               -0.55
bye               -0.55
GenerationType    -0.55
Wra               -0.54
+":               -0.53
jdt               -0.53
Rost              -0.52
osene             -0.51
"}")              -0.51
Positive Logits
Token             Logit
discovered        1.71
detected          1.60
spotted           1.59
noticed           1.46
discovered        1.44
Discovered        1.42
Detected          1.37
encountered       1.33
observed          1.26
Spotted           1.22
Activations Density: 0.255%