Hello, I need to start all segments with upper case. Just the first word of each segment. Is there a regular expression one can use? Thanks in advance!
If you need to filter for the segments that start with a lowercase letter, so that only the ones breaking your rule are shown, this regex can help you do that:
(?-i)^[a-z]
If you need something else, you'll need to clarify your need.
Hi Olaf Tonn,
I got your private message after you couldn't reply to my post.
I'm afraid you can't find and replace the case in Studio, because as far as I know, .NET (the regex engine behind Studio) won't let you change case in a replacement (other regex engines, like Perl, can handle case conversions). Maybe a regex guru like Anthony Rudd can confirm this point or correct me.
Anyway, I couldn't make this simple regex ^[a-z] work in my Studio 2019 (it matches the 2nd, 3rd… lower case characters as well), so you won't be able to even find the lower case characters at the start of the segment:
I guess this is wrong?
So I'm afraid, filtering is the way to go.
It is true that .NET can only substitute either a "constant" or the content of captured groups.
^[a-z] matches a lowercase character at the start of a segment. What do you want to match?
Olaf needs to find a lowercase letter at the beginning of a segment in order to replace it with its uppercase counterpart.
As I mentioned, such a substitution is not possible with native .NET (and so not possible with Studio).
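Outside the Studio dialog, the same substitution is easy, because the replacement can be computed by a function instead of being written as a replacement pattern. Here is a minimal sketch in Python (not Studio, and not .NET), assuming the target segments have already been exported as plain text; the function name and the sample segments are made up for illustration:

```python
import re


def capitalize_segment_start(segment: str) -> str:
    # The replacement is a function that receives the match and returns
    # its uppercase form; a plain replacement string cannot express this.
    return re.sub(r"^[a-z]", lambda m: m.group(0).upper(), segment)


# Hypothetical sample segments, one string per segment.
segments = ["hello world.", "Already fine.", "42 is a number."]
fixed = [capitalize_segment_start(s) for s in segments]

for before, after in zip(segments, fixed):
    print(f"{before!r} -> {after!r}")
```

Only segments that start with a lowercase a-z letter are changed; everything else passes through untouched. .NET code can do something similar with a MatchEvaluator passed to Regex.Replace, but as far as I know that is not available from Studio's find and replace.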
You could do that in Word, I think. It's a long-winded workaround, but it could work.
Export for external review. Copy the target column into a new file. Convert it to text, separated by paragraph marks. Then search for
^13[a-z]
using wildcards and replace with
^p^&
formatted as Capitals.
After that, convert the text back to a table and copy it back into your previously exported file.
you won't be able to even find the lower case characters at the start of the segment
Try ^[a-z]+ with Case Sensitive selected.
This works only for the basic Latin letters; it will not catch Polish diacritics. ^\p{Ll} should catch these too.
^\p{Ll} should catch these too.
It won't work with Ctrl+F. To include diacritics, just replace z with ž: ^[a-ž]+ (with Case Sensitive selected)
Oh, \p{Ll} works just fine with Search; I tested it just now.
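To show how the two approaches differ, here is a small sketch in Python. It is only a stand-in for the Studio behaviour: Python's built-in re module has no \p{Ll}, so the check uses the Unicode general category "Ll" from unicodedata instead, and the sample words are my own. The range a-ž is compared by code point, so it also covers uppercase letters with diacritics, such as Ł, that sit between a and ž:

```python
import re
import unicodedata


def starts_with_lowercase(segment: str) -> bool:
    # General category "Ll" (lowercase letter) is what \p{Ll} tests
    # in engines that support Unicode categories.
    return bool(segment) and unicodedata.category(segment[0]) == "Ll"


# Hypothetical sample segments.
samples = ["łódź", "Łódź", "zebra", "Zebra", "żółw"]

# Range check, case sensitive: matches any first character from a to ž.
by_range = [s for s in samples if re.match(r"[a-ž]", s)]

# Category check: matches only first characters that are lowercase letters.
by_category = [s for s in samples if starts_with_lowercase(s)]

print("range a-ž:   ", by_range)
print("category Ll: ", by_category)
```

In this sketch the range check also reports "Łódź", which already starts with a capital, while the category check does not. If the Studio engine treats the range the same way, \p{Ll} is the safer choice for languages with diacritics.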