After learning more about the GitHub Copilot Agent at the 2026 Microsoft MVP Summit, I decided to finally give it a serious try. This was my first time using the agent on GitHub since I normally rely on Copilot Chat in Visual Studio.

To be honest, my experience with Copilot in Visual Studio has been disappointing. Too often, it generates unit tests that do not compile, do not follow my established patterns, or simply miss the mark. Still, I wanted to give the GitHub agent a fair shot.

Because I am always playing catch-up on unit tests in my open-source Spargine libraries, I decided to use the agent to find and add missing tests across the projects. I hoped it would speed up a task that is both important and time-consuming.

This article covers what I found: the serious issues, the surprising successes, and the lessons I learned along the way.

I started with a simple prompt:

Review the unit tests for CacheStatistics.cs. Ensure all code paths are covered. If not, add the required unit tests.

That is not the prompt I ended up with.

Why I Tried the GitHub Copilot Agent

For the Spargine solution, my plan was simple: let the agent write tests, synchronize the changes locally, confirm the solution built successfully, and then run the unit tests. I repeated that process class by class.

It did not take long for the problems to show up.

The Problems I Ran Into

It Deleted Three Test Projects

The first major issue was a bad one.

After synchronizing changes from GitHub and attempting a build, the solution exploded with hundreds of errors. When I looked more closely, I could hardly believe what I saw: every source file from three unit test projects had disappeared from my local file system, leaving behind only the project files. Fortunately, the files were still in the GitHub repository.

When I went back and reviewed the changes I had approved, I saw that the agent had renamed project files. Somehow, when I synchronized everything locally, the original projects and files were effectively wiped out.

I should have caught it earlier, but honestly, I had no idea what those changes meant.

I panicked. I am not a GitHub expert, and frankly, I do not think developers should need to be GitHub experts just to recover from this kind of mess. Thankfully, the Copilot desktop app helped me reset everything back to the repository version.

Then, the next morning, the agent did it again.

This time, I caught it before merging and canceled the pull request.

To stop it from happening again, I added this line to my prompt:

Do not modify any csproj files.

So far, that has prevented a repeat of the problem. But I am still dumbfounded that the agent did this in the first place.

Odd Code Generation Choices

In one case, the agent created a test enum and used fully qualified type names for attributes such as DescriptionAttribute. That is not standard coding practice, and it made the generated code look clumsy and inconsistent.

That’s when I added this line to the prompt:

You MUST follow all code-formatting and naming conventions defined in .editorconfig.
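
For readers who have not set this up, here is a minimal sketch of the kind of .editorconfig rules that instruction points at. These are standard .NET code-style options, but the selection and the severities are my assumptions for illustration; this is not a copy of the Spargine file.

```ini
# Sketch only: standard .NET code-style options. The values and severities
# are assumptions, not the settings from the Spargine repository.
[*.cs]

# Prefer simplified names over fully qualified type names (IDE0001).
dotnet_diagnostic.IDE0001.severity = warning

# Prefer file-scoped namespaces.
csharp_style_namespace_declarations = file_scoped:warning

# Always use braces, even for single-line if statements.
csharp_prefer_braces = true:warning

# Flag unnecessary casts (IDE0004).
dotnet_diagnostic.IDE0004.severity = warning

# Flag unnecessary value assignments (IDE0059) and prefer discards.
csharp_style_unused_value_assignment_preference = discard_variable:warning
dotnet_diagnostic.IDE0059.severity = warning
```

Some IDE rules are only reported in the editor and not during a command-line build, so an agent running on GitHub may never see them unless they are also spelled out in the prompt.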

Phantom Code Changes

At another point, I saw the agent removing code, an internal static class Program, that did not exist anywhere in Spargine.

That raised an obvious question: what exactly was it deleting?

Was it some temporary code the agent created to run tests? If so, why? There is no good reason for it to inject mystery code when it can simply run the test project.

Thankfully, after synchronizing the code, no rogue internal static class Program ended up in Spargine. But seeing unexplained changes like that does not inspire confidence.

Performance Is Far Too Slow

The amount of time the agent spends generating unit tests ranges from slow to painfully slow.

Sometimes it takes several minutes. Other times it takes an hour. I have even seen it take three hours.

Yes, three hours.

That is unacceptable. Even one hour is far too long for a workflow like this. I sincerely hope the GitHub team is actively working on performance, because in its current state the delay is hard to justify.

Review Feedback Does Not Work the Way You’d Expect

In a normal pull request workflow, when I find an issue in code review, I request changes and send it back to the developer. That is how developers improve, and it keeps the review cycle efficient.

That is not how this works with the agent.

When the agent makes a bad change and I request changes, it does not automatically pick that up and fix the issue. I learned that I must explicitly start the review comment with @copilot for the agent to respond.

That is not intuitive, and it is not how most developers would expect the review flow to work.

It Still Misses Code Paths

My original prompt explicitly said this:

Ensure all code paths are covered. If not, add the required unit tests.

You would think that instruction would be clear enough.

It was not.

After using the agent on the DotNetTips.Spargine.Core assembly, I ran Analyze Code Coverage in Visual Studio and found that it still missed many code paths.

After using the agent across the entire Spargine solution, here were the results:

File Name	Covered (Blocks)	Not Covered (Blocks)	Covered (Lines)	Partially Covered (Lines)	Not Covered (Lines)
dotnettips.spargine.10.dll	1,258	419	987	36	317
dotnettips.spargine.10.extensions.dll	4,220	278	3,095	40	177
dotnettips.spargine.10.tester.dll	2,926	302	2,124	36	202
dotnettips.spargine.10.core.dll	8,978	1,213	6,214	187	667
Total	17,382	2,212	12,420	299	1,363

That is disappointing.
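
To put the table in percentage terms, here is a small sketch that computes block coverage as covered blocks divided by covered plus not-covered blocks. The CoverageMath class is a hypothetical helper written for this article; the numbers are copied from the table above.

```csharp
using System;

public static class CoverageMath
{
    // Percentage of blocks covered, rounded to one decimal place.
    public static double BlockCoveragePercent(int covered, int notCovered) =>
        Math.Round(100.0 * covered / (covered + notCovered), 1);
}

public static class Program
{
    public static void Main()
    {
        // Block counts copied from the table above: (assembly, covered, not covered).
        (string Assembly, int Covered, int NotCovered)[] results =
        {
            ("dotnettips.spargine.10.dll", 1258, 419),
            ("dotnettips.spargine.10.extensions.dll", 4220, 278),
            ("dotnettips.spargine.10.tester.dll", 2926, 302),
            ("dotnettips.spargine.10.core.dll", 8978, 1213),
            ("Total", 17382, 2212),
        };

        foreach ((string assembly, int covered, int notCovered) in results)
        {
            double percent = CoverageMath.BlockCoveragePercent(covered, notCovered);
            Console.WriteLine($"{assembly}: {percent}% of blocks covered");
        }
    }
}
```

By this calculation, block coverage comes to roughly 75.0%, 93.8%, 90.6%, and 88.1% for the four assemblies, and about 88.7% overall.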

Back in the .NET Framework days, we had IntelliTest, which could elegantly generate tests to cover code paths. I was deeply disappointed when that capability did not carry over into modern .NET. I spent two years working with the IntelliTest PM to help make that feature more useful for developers, so I know firsthand how valuable that kind of tooling can be.

If you have read my articles or my coding standards book, you know I talk a lot about cyclomatic complexity. To me, cyclomatic complexity represents the minimum number of unit tests needed for a method or property just to validate encapsulation properly.

I had hoped the agent would do better here.
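
To make the cyclomatic complexity point concrete, here is a minimal sketch. ShippingCalculator is a hypothetical class invented for this example and is not part of Spargine. The method has two decision points plus the default path, so its cyclomatic complexity is 3, and it takes at least three tests to exercise every path. The sketch assumes MSTest 3.8 or later for Assert.ThrowsExactly.

```csharp
using System;
using Microsoft.VisualStudio.TestTools.UnitTesting;

namespace ComplexitySketch;

// Hypothetical class for illustration only; it is not part of Spargine.
public static class ShippingCalculator
{
    // Two decision points plus the default path give a cyclomatic complexity of 3.
    public static decimal GetRate(decimal weightInKilograms)
    {
        if (weightInKilograms <= 0)
        {
            throw new ArgumentOutOfRangeException(nameof(weightInKilograms));
        }

        if (weightInKilograms > 50)
        {
            return 25m;
        }

        return 10m;
    }
}

[TestClass]
public class ShippingCalculatorTests
{
    // Path 1: the guard clause throws.
    [TestMethod]
    public void GetRateThrowsWhenWeightIsNotPositive()
    {
        _ = Assert.ThrowsExactly<ArgumentOutOfRangeException>(
            () => { _ = ShippingCalculator.GetRate(0m); });
    }

    // Path 2: the heavy-package branch.
    [TestMethod]
    public void GetRateReturnsHeavyRateAboveFiftyKilograms()
    {
        Assert.AreEqual(25m, ShippingCalculator.GetRate(51m));
    }

    // Path 3: the default path.
    [TestMethod]
    public void GetRateReturnsStandardRateForNormalWeight()
    {
        Assert.AreEqual(10m, ShippingCalculator.GetRate(10m));
    }
}
```

Drop any one of these tests and a path goes unverified, which is exactly the kind of gap the coverage numbers above reveal.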

I Gave the Agent a Second Chance. It Still Missed Code Paths

I was disappointed that the agent did not cover all code paths, so I decided to give it another chance.

For the first class, coverage started at 81.4% of blocks covered. I asked the agent to add the missing tests needed to cover all code paths, but the result stayed at 81.4%. I tried again, and it still remained at 81.4%. At that point, I started to wonder whether a recent Visual Studio update had affected code coverage analysis, especially since the agent was adding new tests each time.

To rule that out, I tried a second class that started at 82.2% covered blocks. After the next attempt, coverage increased to 88.5%. That was an improvement, but it still fell short of 100%. I tried again, and once again the result stayed at 88.5%.

At this point, the pattern was clear: the agent does not appear to generate tests that cover all code paths the way IntelliTest once did in the .NET Framework.

Tests That Passed on GitHub Failed Locally

When I reviewed the generated tests locally, I found 18 failing tests.

In one example, a test expected TaskCanceledException, but the actual exception was UriFormatException. In another, the agent created a test that tried to start /bin/echo, which may work in one environment but does not work on a local Windows machine.

That means the agent produced tests that passed on GitHub but failed locally.

That is a serious problem.

Because of this, I added another instruction to my prompt:

Make sure all unit tests work on GitHub and a local Windows machine.
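
As an illustration of what a portable version of the /bin/echo test could look like, here is a minimal sketch. The test class and method names are hypothetical, and I am assuming .NET 5 or later (for OperatingSystem.IsWindows) and a Linux-based environment on the GitHub side. This is not the test the agent generated.

```csharp
using System;
using System.Diagnostics;
using Microsoft.VisualStudio.TestTools.UnitTesting;

namespace CrossPlatformSketch;

[TestClass]
public class ProcessStartTests
{
    [TestMethod]
    public void EchoProcessWritesExpectedText()
    {
        // Pick a shell that exists on the current OS instead of hard-coding /bin/echo.
        ProcessStartInfo startInfo = OperatingSystem.IsWindows()
            ? new ProcessStartInfo("cmd.exe", "/c echo hello")
            : new ProcessStartInfo("/bin/sh", "-c \"echo hello\"");

        // Capture standard output so the test can assert on it.
        startInfo.RedirectStandardOutput = true;
        startInfo.UseShellExecute = false;

        // The using declaration disposes the process when the test ends.
        using Process? process = Process.Start(startInfo);
        Assert.IsNotNull(process);

        string output = process.StandardOutput.ReadToEnd();
        process.WaitForExit();

        Assert.AreEqual(0, process.ExitCode);
        Assert.AreEqual("hello", output.Trim());
    }
}
```

Trimming the output hides the line-ending difference between the two platforms, which is another easy way for a test to pass in one environment and fail in the other.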

Other Quality Issues I Found

Some of these problems happened even after I told the agent to follow my EditorConfig settings:

It created unnecessary casts.
It created unnecessary variable assignments.
It did not consistently use braces for if statements.
It did not use discards correctly.
It used var inconsistently.
It failed to use better assertions, such as Assert.HasCount, Assert.IsFalse, and Assert.IsLessThanOrEqualTo where appropriate.
It preferred constructors when TestInitialize would have been the better choice.
Its end-of-session summaries were inconsistent and often lacked useful detail.
It did not always show the Create Pull Request option after making changes, which appears to be a refresh issue.
It did not properly dispose of disposable objects. This is a big one.
It did not use file-scoped namespaces.
It used underscores in method names, ignoring my conventions.
It did not properly add [ExcludeFromCodeCoverage] to test classes.
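
Several of the items above are easier to see in code. Here is a minimal sketch of a test class that follows them: a file-scoped namespace, [ExcludeFromCodeCoverage] on the class, TestInitialize instead of a constructor, disposal in TestCleanup, no underscores in method names, and Assert.HasCount. It tests MemoryStream only so that the example stands alone; the class, the namespace, and the field naming are my assumptions, and Assert.HasCount assumes MSTest 3.8 or later.

```csharp
using System.Diagnostics.CodeAnalysis;
using System.IO;
using Microsoft.VisualStudio.TestTools.UnitTesting;

namespace ConventionsSketch;

[ExcludeFromCodeCoverage]
[TestClass]
public class MemoryStreamTests
{
    // Assigned in Initialize before every test, so the null-forgiving default is safe.
    private MemoryStream _stream = null!;

    [TestInitialize]
    public void Initialize()
    {
        this._stream = new MemoryStream();
    }

    [TestCleanup]
    public void Cleanup()
    {
        // Dispose the stream after every test, whether it passed or failed.
        this._stream.Dispose();
    }

    [TestMethod]
    public void WriteByteAdvancesLength()
    {
        this._stream.WriteByte(42);

        Assert.AreEqual(1L, this._stream.Length);
    }

    [TestMethod]
    public void ToArrayReturnsWrittenBytes()
    {
        byte[] payload = new byte[] { 1, 2, 3 };

        this._stream.Write(payload, 0, payload.Length);
        byte[] result = this._stream.ToArray();

        Assert.HasCount(3, result);
    }
}
```

MSTest creates a new instance of the test class for each test method, so TestInitialize and TestCleanup give every test a fresh stream and guarantee it is disposed.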

What the Agent Did Well

To be fair, it was not all bad.

Here are a few things the agent did reasonably well:

In the Spargine Tester project, there are multiple files with the same name. When I used a prompt like this, the agent handled that correctly:
Review the unit tests for PersonComparerByLastName.cs files.
Much of the generated code looked pretty good overall, even if formatting was not always consistent.
This experiment added more than 1,300 new unit tests to Spargine.
From what I have reviewed so far, many of those tests are solid.
It correctly formatted class header information in the standard Spargine style.

So yes, the agent did provide real value. It just also created enough friction that I had to keep tightening the process.

The Prompt That Worked Better

After running into all of these issues, this is the prompt I ended up using:

Review the unit tests for CacheStatistics.cs. Ensure all code paths are covered. If not, add the required unit tests. Make sure all unit tests work on GitHub and a local Windows machine. Have the unit test class inherit from UnitTester, but only if it would add value. If possible, create test data using the RandomData class. Test methods should not include XML documentation. Change the UnitTestStatus for any method with full coverage to Completed. Do not modify any csproj files. You MUST follow all code-formatting and naming conventions defined in .editorconfig. DO NOT use underscores in method names. Don't create unnecessary casts. Make sure to update the Last Modified On to "Copilot Agent" and update Last Modified By in the file header of all files that were changed.

That is a long prompt.

To simplify it, I moved most of those instructions into a copilot-instructions.md instruction file. At first, the agent did not seem to pick it up reliably, but after I explicitly called out what it was doing wrong, it started following the file more consistently.

Now my working prompt is much simpler:

Review the unit tests for CacheStatistics.cs. Add any missing unit tests.

You can view the instruction file here: https://github.com/RealDotNetDave/dotNetTips.Spargine.10/blob/master/.github/copilot-instructions.md
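
If you want a starting point for your own repository, here is a sketch of how the long prompt above could be broken into an instruction file. Every rule is taken from that prompt, but the heading and layout are my assumptions; this is not a copy of the linked Spargine file.

```markdown
# Unit test instructions (illustrative sketch, not the actual Spargine file)

- Ensure all code paths are covered. Add any missing unit tests.
- All unit tests must work on GitHub and on a local Windows machine.
- Have the unit test class inherit from UnitTester, but only if it would add value.
- If possible, create test data using the RandomData class.
- Test methods should not include XML documentation.
- Do not modify any csproj files.
- Follow all code-formatting and naming conventions defined in .editorconfig.
- Do not use underscores in method names.
- Do not create unnecessary casts.
```

Short, single-purpose bullets like these are easier to adjust one at a time when the agent ignores a rule.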

Final Thoughts

After using the GitHub Copilot Agent across all the Spargine projects, I came away with mixed feelings.

It writes better unit tests than Copilot Chat in Visual Studio. That alone made the experiment worthwhile. It also helped add a substantial number of tests across the solution, which saved real time.

But it still has a long way to go.

Its biggest weaknesses are reliability, code-path coverage, inconsistent adherence to coding conventions, and the fact that it can generate tests that behave differently on GitHub and on a local Windows machine. On top of that, performance is simply too slow for a tool that is supposed to accelerate development.

Even so, this experiment was useful. It forced me to refine my prompts, tighten my workflow, and think more clearly about what I expect from AI-assisted testing tools.

Here is the most important takeaway: the GitHub Copilot Agent can help, but it still needs close supervision. It is not ready to be trusted on its own, especially for something as important as test quality and code coverage.

Below is the number of unit tests in Spargine before and after using the agent:

File	Before the Agent	After the Agent
dotnettips.spargine.10.dll	175	336
dotnettips.spargine.10.extensions.dll	1,386	1,925
dotnettips.spargine.10.tester.dll	726	790
dotnettips.spargine.10.core.dll	2,127	2,708

That is real progress.

The problem is that progress came with too much cleanup, too much rework, and too many avoidable mistakes.

Before you jump in with the GitHub Copilot Agent for unit testing, go in with your eyes open, keep a close watch on every change, and lock down your instructions early. You may get useful output from it, but you will almost certainly still need to be the adult in the room.

Before getting started, I highly recommend reading this article on what the .NET runtime team found using the Copilot agent: https://devblogs.microsoft.com/dotnet/ten-months-with-cca-in-dotnet-runtime/#the-redmond-flight-experiment

Pick up any of David McCarter's books on Amazon.com: http://bit.ly/RockYourCodeBooks

If you liked this article, please buy David a cup of coffee by going here: https://www.buymeacoffee.com/dotnetdave

© The information in this article is copyrighted and cannot be reproduced in any way without express permission from David McCarter.

GitHub Copilot Agent for Unit Tests: My Real-World Spargine Experiment

Published by David (dotNetDave) McCarter
