Use Wildcard Search to Find Word Replacement Character Issues

Have you ever opened a document in Microsoft Word and seen strange symbols? By strange, I mean a black diamond with a white question mark, or a dotted box with OBJ in the middle. In this tutorial, I’ll show how to use wildcard search in Word’s Advanced Find feature to highlight and fix replacement characters.

What are Replacement Characters?

If you look at the image below, you’ll see the two offending items. Offending may be too harsh a word, so the term “replacement” is better. These characters occur when Microsoft Word can’t translate something. That usually happens because the document was created in another application, a font is missing, or the document was scanned. Lots of reasons and lots of mysteries.

When Word encounters a character it can’t display, it replaces that character with �. This is our clue that we need to fix something. A similar issue happens with embedded objects like images, which show up as the object replacement character ￼. These symbols aren’t unique to Word; you’ll see them in many applications ranging from email to spreadsheets like Excel. In fact, each has its own Unicode code point: U+FFFD for the replacement character and U+FFFC for the object replacement character.

Word paragraph with replacement character and object replacement character.
Example of Replacement Character and Object Replacement Character
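
If you want to see these two code points outside of Word, here is a minimal Python sketch. It is only an illustration and not part of the Word workflow: it prints the official Unicode name of each character and shows one common way a replacement character appears, namely decoding bytes that are not valid UTF-8. The sample bytes are made up for the example.

```python
import unicodedata

REPLACEMENT = "\ufffd"         # usually drawn as a black diamond with a question mark
OBJECT_REPLACEMENT = "\ufffc"  # usually drawn as a dotted box labelled OBJ

# Print the code point and official Unicode name of each character.
for ch in (REPLACEMENT, OBJECT_REPLACEMENT):
    print(f"U+{ord(ch):04X}", unicodedata.name(ch))

# Made-up sample: "caf" followed by a Latin-1 encoded e-acute (0xE9),
# which is not valid UTF-8 on its own. With errors="replace", the
# decoder substitutes U+FFFD for the byte it cannot decode.
garbled = b"caf\xe9".decode("utf-8", errors="replace")
print(garbled)
```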

Finding Replacement Characters

One problem with these characters is that you can’t paste them into Word’s Find feature and get correct results. They could also contribute to Word’s spell check not working as expected. In the image below, you’ll see I can paste in the black diamond question mark character, but Word also highlights spaces.

Word Find showing 676 results and highlighting spaces.
Word highlights spaces too

Advanced Find and Unicodes

As I mentioned earlier, each of the replacement characters has its own Unicode code point. I thought I could use the FFFD code (U+FFFD) in combination with a wildcard search. Word objected, indicating that ^ is not a valid special character for the Find what box. However, the ^ character is valid for the Replace box.

Find error message indicating I can't use special characters.
Can’t search for black diamond by Unicode

Redefining the Search Group

Sometimes you need a Plan B. Although Word’s Find feature didn’t allow me to search for that Unicode question mark symbol or OBJ box, I could redefine the problem. One feature of using wildcard syntax is you can define what to exclude.

I did this using a set of broad rules based on my document. My intent was to eliminate everything I wasn’t expecting or what I call “cruft”. I created a group of items that excluded:

  • a-z: lowercase letters
  • A-Z: uppercase letters
  • 0-9: numbers
  • a space character
  • various punctuation items such as commas, single and double quotes, periods, hyphens, colons, semicolons, question marks, and backslashes

And my wildcard syntax looked like the string below. Don’t worry, I’ll explain it shortly.

([!a-zA-Z0-9 ",'’“”.-:;\-\?\\]){1,}

Understanding the Wildcard Syntax

The syntax may look familiar if you know regular expressions (RegEx). On the other hand, if RegEx doesn’t mean anything, your eyes may have glazed over. Just remember my broad rules above.

In some regards, these expressions are like some of our primary school math lessons where the parentheses and brackets provide order and groupings. In our case, the group is defined by the [ ].

If we look back at my broad rules above, we can do some pattern matching.

My Broad Rules | Wildcard syntax
lowercase letters | a-z
uppercase letters | A-Z
spaces | (I used my spacebar)
punctuation | ‘’“”.-:;-\?\\ (Yes, I have some curly quote marks)

There are still some nuances about the syntax, which I’ll explain.

  1. Brackets surround our search group [ ].
  2. Word’s wildcard search is case-sensitive, so we have both upper and lower case letters.
  3. The ! sign at the group beginning means Not. This is how I exclude valid characters.
  4. A plain space represents the space.
  5. Some of the punctuation items from my document are preceded by a \ character, which escapes them so Word treats them as literal characters. For example, an unescaped ? represents any single character.
  6. The expression is wrapped in parentheses ( ).

The last portion, {1,}, tells Microsoft Word to find one or more consecutive characters that do not belong to the group. Those leftover characters include the replacement character � and the object replacement character ￼.
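
For readers who know regular expressions, here is a rough Python counterpart of the wildcard string, offered as a sketch and not an exact translation. In Python, [^...] plays the role of Word’s [!...] and + plays the role of {1,}. One assumption to note: the sketch lists the period and colon individually instead of copying the .-: span from my string, so characters such as the forward slash may be treated differently than in Word. The sample text is invented.

```python
import re

# Rough counterpart of the Word wildcard string
#   ([!a-zA-Z0-9 ",'’“”.-:;\-\?\\]){1,}
# [^...] means "any character NOT listed"; + means "one or more".
UNEXPECTED = re.compile(r"""[^a-zA-Z0-9 ",'’“”.:;\-?\\]+""")

# Invented sample containing both replacement characters.
sample = "Totals\ufffd for Q3: see chart \ufffc. Done"

for match in UNEXPECTED.finditer(sample):
    # Show where each run starts and which code points it contains.
    codes = " ".join(f"U+{ord(c):04X}" for c in match.group())
    print(match.start(), codes)
```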

Our Expression in Word Advanced Find

We need to use Word’s Advanced Find to locate these special characters. The goal is to have Word highlight the items it finds in yellow so it’s easier to fix the document.

  1. From the Word ribbon, click the Home tab.
  2. In the Editing Group, click the Replace button.
  3. In the Find and Replace dialog box, click the Find tab.
  4. Click the More >> button.
  5. Type or copy the full syntax string above into the Find what: textbox.
  6. In the Search Options section, check Use wildcards.
  7. In the Reading Highlight drop-down, select Highlight All.

And if your Word document has these characters, you should see something like the image below.

Word highlighted in yellow the replacement characters.
Word highlights replacement characters in yellow
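
If you have many documents to check, the same hunt can be sketched outside Word. The Python example below relies on the fact that a .docx file is a zip archive containing word/document.xml. The function name find_suspects and the SUSPECTS table are my own hypothetical names, and the sketch only looks at the main body text, not headers or footnotes. Also keep in mind that real embedded objects are typically stored as XML elements, so the object replacement character will only be reported where it exists as literal text.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace used by WordprocessingML elements such as <w:p> and <w:t>.
W_NS = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"

SUSPECTS = {
    "\ufffd": "replacement character",
    "\ufffc": "object replacement character",
}

def find_suspects(docx):
    """Return (paragraph_number, label) pairs for each suspect character.

    `docx` is a path or a binary file-like object for a .docx file.
    """
    with zipfile.ZipFile(docx) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    hits = []
    for number, para in enumerate(root.iter(f"{W_NS}p"), start=1):
        # Join the text runs that make up this paragraph.
        text = "".join(node.text or "" for node in para.iter(f"{W_NS}t"))
        hits.extend((number, SUSPECTS[ch]) for ch in text if ch in SUSPECTS)
    return hits
```

Calling find_suspects("report.docx"), where report.docx is a placeholder file name, would return a list of paragraph numbers to inspect in Word.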

Tweaking the Search Expression

As you can see, Word found my problem characters, but it also flagged some other items. The syntax I used was enough for me to triage the document. If I were a perfectionist, I could go back and add the other characters it highlighted to the group. These were mostly other punctuation symbols I used and other keys like the + key. It did, however, find another problem character: □.
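
To decide which characters are worth adding to the group, it can help to tally everything that falls outside it. The Python sketch below is one way to do that on text copied out of a document. The EXPECTED set mirrors my broad rules, tally_unexpected is a hypothetical helper name, and the sample text (including the white square U+25A1 standing in for the box character) is invented.

```python
import string
import unicodedata
from collections import Counter

# Characters treated as normal, mirroring the broad rules:
# letters, digits, a space, and common punctuation.
EXPECTED = set(string.ascii_letters + string.digits + " \",'’“”.-:;?\\")

def tally_unexpected(text):
    """Count each character in `text` that is outside EXPECTED."""
    return Counter(ch for ch in text if ch not in EXPECTED)

# Invented sample: a plus sign, an equals sign, two replacement
# characters, and a white square.
sample = "Cost + tax = total \ufffd \u25a1 \ufffd"

for ch, count in tally_unexpected(sample).most_common():
    name = unicodedata.name(ch, "UNNAMED")
    print(f"U+{ord(ch):04X} {name}: {count}")
```

Harmless entries in the tally, such as the plus sign, are candidates for the group; the rest are the characters to fix.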

While the Find tab restricts how you can use the ^ character, there are still plenty of options. If you click the Special button in the Find section, you’ll see a menu with predefined options like em dash, tabs, graphics, etc. You can also see some of the codes I used. This is a great way to learn the syntax.

Once you find your perfect find syntax, you could use a program like TextExpander or ActiveWords to save it so you don’t have to recreate it.

Key Points & Takeaways

  • Replacement characters occur when Microsoft Word can’t translate something. This could be due to the document being created in another application, missing fonts, or the document being scanned.
  • When Word encounters a character it can’t display, it replaces that character with �. A similar issue happens with embedded objects like images.
  • One problem with these characters is that you can’t paste them into Word’s Find feature and get the correct results.
  • Each replacement character has its own Unicode code point: U+FFFD for � and U+FFFC for the object replacement character.
  • Word objects to using the ^ character in the Find what box. However, the ^ character is valid for the Replace box.