Introduction to Network Trace Analysis 5: SMB? Sounds good to me!

  • Thread starter: WillAftring

Howdy everyone, it’s your favorite Software Engineer, Will, back again talking about the Server Message Block (SMB) protocol!



Why talk about SMB?

Let's start off with the question, what is this whole SMB thing anyway? SMB is a network file system protocol. This means that it can allow Machine A to read and write files on Machine B. This protocol serves as the backbone of much of the Enterprise Windows Ecosystem. For example, did you know that the group policy SYSVOL is an SMB share? Pretty cool right?

In recent history, there have been tons of improvements to SMB. For the sake of understanding the core protocol, we will not be covering those newer features here, but we may touch on them in a later blog post.

What I would like to hammer home is that there is a large amount of existing Microsoft content about SMB. Since those articles were written, there has been a ton of work done on the SMB PowerShell Cmdlets. If you ever need to make ANY changes to SMB, the recommendation is to use either policy or the SMB Cmdlets instead of directly interfacing with the Windows Registry.



Client Cmdlets: Set-SmbClientConfiguration (SmbShare) | Microsoft Learn

Server Cmdlets: Set-SmbServerConfiguration (SmbShare) | Microsoft Learn



Protocol Overview

The SMB protocol is a call and response protocol. It operates over TCP port 445, by default. Versions of Windows released in the Fall of 2024 and later allow alternative SMB ports.

The SMB client makes a request, and the server responds to that request. The start of every SMB connection follows an identical pattern.

The flow of a new SMB connection is as follows:

  • SMB Dialect Negotiation
    • What language do we speak?
    • SMB 1.0 (deprecated)
    • SMB 2.0
    • SMB 3.0
  • SMB Capability Negotiation
    • What can we both do?
    • SMB Signing
    • SMB Encryption
    • etc...
  • User Authentication (Session Setup)
    • Who are you?
    • NTLM
    • Kerberos
  • Tree Connect
    • Which share are we connecting to (i.e., the share name)?

Everything after this is up to the client to ask for. We will give some examples of what the client can do later.
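
To make the order of those stages concrete, here is a minimal Python sketch that takes the Info column text from a capture and reports the last setup stage that completed without an error. The helper name `last_setup_stage` and the idea of feeding it Info strings are my own assumptions for illustration; it does not parse a real capture file.

```python
# Hypothetical helper: the stage names mirror the text Wireshark shows in the
# Info column for SMB2 frames. Nothing here reads a capture file.
SETUP_ORDER = ["Negotiate Protocol", "Session Setup", "Tree Connect"]


def last_setup_stage(info_lines):
    """Return the last setup stage, in order, that got an error-free response.

    info_lines is a list of Info column strings.
    Returns None if not even the negotiation produced a clean response.
    """
    reached = None
    for stage in SETUP_ORDER:
        succeeded = any(
            line.startswith(stage + " Response") and "Error:" not in line
            for line in info_lines
        )
        if not succeeded:
            break
        reached = stage
    return reached


healthy = [
    "Negotiate Protocol Request",
    "Negotiate Protocol Response",
    "Session Setup Request",
    "Session Setup Response",
    "Tree Connect Request Tree: \\\\MB01\\ShareName",
    "Tree Connect Response",
]
print(last_setup_stage(healthy))
```

Keep in mind that a capture taken on an already established connection will not contain the earlier stages at all, so a result of None only means the stages were not seen, not that they failed.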



Before we do that let's walk through what this might look like in a packet capture.

I have a capture of a client connecting to the share  \\MB01\ShareName .



Here is what that looks like using Wireshark:



// Here is the TCP 3-way handshake
47 16:33:42.007501 172.16.1.17 172.16.1.18 TCP 66 64240 49810 → 445 [SYN] Seq=0 Win=64240 Len=0 MSS=1460 WS=256 SACK_PERM
48 16:33:42.007811 172.16.1.18 172.16.1.17 TCP 66 65535 445 → 49810 [SYN, ACK] Seq=0 Ack=1 Win=65535 Len=0 MSS=1460 WS=256 SACK_PERM
49 16:33:42.007915 172.16.1.17 172.16.1.18 TCP 54 262656 49810 → 445 [ACK] Seq=1 Ack=1 Win=262656 Len=0

// The initial SMB protocol negotiation
50 16:33:42.007954 172.16.1.17 172.16.1.18 SMB 127 262656 Negotiate Protocol Request
51 16:33:42.008457 172.16.1.18 172.16.1.17 SMB2 306 2097920 Negotiate Protocol Response
52 16:33:42.008505 172.16.1.17 172.16.1.18 SMB2 318 262400 Negotiate Protocol Request
53 16:33:42.008897 172.16.1.18 172.16.1.17 SMB2 430 2097664 Negotiate Protocol Response

// Authentication happens in these two frames
64 16:33:42.016084 172.16.1.17 172.16.1.18 SMB2 1883 262144 Session Setup Request
66 16:33:42.016726 172.16.1.18 172.16.1.17 SMB2 314 2097920 Session Setup Response

// And, finally, connect to the share
73 16:33:42.018224 172.16.1.17 172.16.1.18 SMB2 162 262656 Tree Connect Request Tree: \\MB01\ShareName
74 16:33:42.018468 172.16.1.18 172.16.1.17 SMB2 138 2097408 Tree Connect Response


See? Not so bad. But wait, there’s more!



The responses to the setup are then used in the SMB header going forward to provide context to the connection. For example, here is the session setup request and response:



64 13.581207 172.16.1.17 172.16.1.18 SMB2 1883 Session Setup Request
Frame 64: 1883 bytes on wire (15064 bits), 1883 bytes captured (15064 bits) on interface \Device\NPF_{7263DA0A-0F05-4542-84C9-33E17CEDC31C}, id 0
Ethernet II, Src: Microsoft_01:2b:07 (00:15:5d:01:2b:07), Dst: Microsoft_01:2b:08 (00:15:5d:01:2b:08)
Internet Protocol Version 4, Src: 172.16.1.17, Dst: 172.16.1.18
Transmission Control Protocol, Src Port: 49810, Dst Port: 445, Seq: 338, Ack: 629, Len: 1829
NetBIOS Session Service
SMB2 (Server Message Block Protocol version 2)
SMB2 Header
ProtocolId: 0xfe534d42
Header Length: 64
Credit Charge: 1
Channel Sequence: 0
Reserved: 0000
Command: Session Setup (1)
Credits requested: 33
Flags: 0x00000010, Priority
Chain Offset: 0x00000000
Message ID: 2
Process Id: 0x0000feff
Tree Id: 0x00000000
Session Id: 0x0000000000000000
Signature: 00000000000000000000000000000000
[Response in: 66]
Session Setup Request (0x01)

66 13.581849 172.16.1.18 172.16.1.17 SMB2 314 Session Setup Response
SMB2 (Server Message Block Protocol version 2)
SMB2 Header
ProtocolId: 0xfe534d42
Header Length: 64
Credit Charge: 1
NT Status: STATUS_SUCCESS (0x00000000)
Command: Session Setup (1)
Credits granted: 33
Flags: 0x00000019, Response, Signing, Priority
Chain Offset: 0x00000000
Message ID: 2
Process Id: 0x0000feff
Tree Id: 0x00000000
Session Id: 0x0000080000000009
[Authenticated in Frame: 66]
Signature: 70db969049fb94d444eaf0bbad0e70de
[Response to: 64]
[Time from request: 0.000642000 seconds]
Session Setup Response (0x01)


All subsequent requests within this session will use this session ID, in this case 0x0000080000000009.
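
If you want to see where those fields live on the wire, here is a minimal Python sketch that unpacks the fixed 64-byte SMB2 header. It assumes the synchronous header layout (in an asynchronous message the Process Id and Tree Id slots hold a single 8-byte Async Id instead), and the field names are my own labels rather than anything official. The sample bytes are rebuilt from the frame 66 values shown above, not copied from the capture.

```python
import struct

# 64-byte SMB2 header for a synchronous message, little-endian.
# In a request, the 4-byte "status" slot carries Channel Sequence + Reserved.
SMB2_HEADER = struct.Struct("<4sHHIHHIIQIIQ16s")

FIELDS = (
    "protocol_id", "structure_size", "credit_charge", "status",
    "command", "credits", "flags", "next_command", "message_id",
    "process_id", "tree_id", "session_id", "signature",
)


def parse_smb2_header(data):
    """Unpack the first 64 bytes of an SMB2 message into a dict."""
    if len(data) < SMB2_HEADER.size:
        raise ValueError("need at least 64 bytes")
    header = dict(zip(FIELDS, SMB2_HEADER.unpack_from(data)))
    if header["protocol_id"] != b"\xfeSMB":
        raise ValueError("not an SMB2 message")
    return header


# Header of frame 66 (Session Setup Response), rebuilt from the dissection.
frame_66 = SMB2_HEADER.pack(
    b"\xfeSMB", 64, 1, 0x00000000, 1, 33, 0x00000019, 0, 2,
    0x0000FEFF, 0x00000000, 0x0000080000000009,
    bytes.fromhex("70db969049fb94d444eaf0bbad0e70de"),
)

header = parse_smb2_header(frame_66)
print(format(header["session_id"], "#018x"))
```

The ProtocolId that Wireshark shows as 0xfe534d42 is simply the byte 0xFE followed by the ASCII letters "SMB", which is why the sketch checks for `b"\xfeSMB"`.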



Here is the Tree Connect request header:



73 13.583347 172.16.1.17 172.16.1.18 SMB2 162 Tree Connect Request Tree: \\MB01\ShareName
Frame 73: 162 bytes on wire (1296 bits), 162 bytes captured (1296 bits) on interface \Device\NPF_{7263DA0A-0F05-4542-84C9-33E17CEDC31C}, id 0
Ethernet II, Src: Microsoft_01:2b:07 (00:15:5d:01:2b:07), Dst: Microsoft_01:2b:08 (00:15:5d:01:2b:08)
Internet Protocol Version 4, Src: 172.16.1.17, Dst: 172.16.1.18
Transmission Control Protocol, Src Port: 49810, Dst Port: 445, Seq: 2547, Ack: 1469, Len: 108
NetBIOS Session Service
SMB2 (Server Message Block Protocol version 2)
SMB2 Header
ProtocolId: 0xfe534d42
Header Length: 64
Credit Charge: 1
Channel Sequence: 0
Reserved: 0000
Command: Tree Connect (3)
Credits requested: 1
Flags: 0x00000018, Signing, Priority
Chain Offset: 0x00000000
Message ID: 6
Process Id: 0x0000feff
Tree Id: 0x00000000
Session Id: 0x0000080000000009 // Here is the session id from the session setup
[Authenticated in Frame: 66]
Signature: 5505a3840f07c5d284e736e521ff13e7
[Response in: 74]
Tree Connect Request (0x03)


The same holds true for the Tree Id, which the server assigns in the Tree Connect response:



74 13.583591 172.16.1.18 172.16.1.17 SMB2 138 Tree Connect Response
Frame 74: 138 bytes on wire (1104 bits), 138 bytes captured (1104 bits) on interface \Device\NPF_{7263DA0A-0F05-4542-84C9-33E17CEDC31C}, id 0
Ethernet II, Src: Microsoft_01:2b:08 (00:15:5d:01:2b:08), Dst: Microsoft_01:2b:07 (00:15:5d:01:2b:07)
Internet Protocol Version 4, Src: 172.16.1.18, Dst: 172.16.1.17
Transmission Control Protocol, Src Port: 445, Dst Port: 49810, Seq: 1469, Ack: 2655, Len: 84
NetBIOS Session Service
SMB2 (Server Message Block Protocol version 2)
SMB2 Header
ProtocolId: 0xfe534d42
Header Length: 64
Credit Charge: 1
NT Status: STATUS_SUCCESS (0x00000000)
Command: Tree Connect (3)
Credits granted: 1
Flags: 0x00000019, Response, Signing, Priority
Chain Offset: 0x00000000
Message ID: 6
Process Id: 0x0000feff
Tree Id: 0x00000005 \\MB01\ShareName
Session Id: 0x0000080000000009
[Authenticated in Frame: 66]
Signature: b85d42847555b1f0a85d775fd8b94d57
[Response to: 73]
[Time from request: 0.000244000 seconds]
Tree Connect Response (0x03)


All operations that are acting on the tree ( \\MB01\ShareName ) will set their Tree Id field to 0x5. Pretty cool right?
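
Notice that the request (frame 73) still carries Tree Id 0 and only the response (frame 74) carries 0x5, so the two have to be paired up by Session Id and Message ID. Here is a rough Python sketch of that bookkeeping. The frame dictionaries and their keys are hypothetical stand-ins for whatever your own parser produces.

```python
def shares_by_tree_id(frames):
    """Map (session_id, tree_id) to the share named in the Tree Connect.

    frames is a list of dicts with hypothetical keys: command, is_response,
    session_id, message_id, tree_id, plus path (requests) or status (responses).
    """
    pending = {}  # (session_id, message_id) -> share path from the request
    trees = {}
    for frame in frames:
        if frame["command"] != "Tree Connect":
            continue
        key = (frame["session_id"], frame["message_id"])
        if not frame["is_response"]:
            pending[key] = frame["path"]
        elif frame["status"] == 0 and key in pending:
            # The server hands out the Tree Id in a successful response.
            trees[(frame["session_id"], frame["tree_id"])] = pending.pop(key)
    return trees


# Values taken from frames 73 and 74 above.
frames = [
    {"command": "Tree Connect", "is_response": False,
     "session_id": 0x0000080000000009, "message_id": 6, "tree_id": 0,
     "path": r"\\MB01\ShareName"},
    {"command": "Tree Connect", "is_response": True, "status": 0,
     "session_id": 0x0000080000000009, "message_id": 6, "tree_id": 5},
]
trees = shares_by_tree_id(frames)
print(trees)
```

In Wireshark itself, the display filter fields `smb2.sesid` and `smb2.tid` let you narrow a trace down to a single session or tree in the same way.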



Before we get into the different scenarios, I want to take a quick detour.

DON'T USE SMB1!


I won't spend much time here since there are much better resources than myself on this but please stop using SMB 1.

Now, let's get into the scenarios.

Scenarios




Oops! No shares.


You have a member server that you use for storage. The member server has two shares, development and production.

You come in bright and early on Monday to a ticket stating, "I can't access the production share!", and with that, let's jump into it.

Your opening questions:

  • Q: When did this first start?
    • A: I don't know. I saw it when I came in two hours ago.
  • Q: What changed?
    • A: Nothing!
  • Q: What is the server’s name?
    • A: I don't know! I have a mapped drive that isn't working!
  • Q: Is the development share working?
    • A: Yes, but I don't care about that. I need the production share!

Not the most helpful but should be enough for us to get going. Let's start by getting a two-sided packet capture while reproducing the issue.



Looking at the mapped share, something is clearly wrong:

[Image: screenshot of the mapped drives]

And when we double click the production share, we get the following error:


(Side note: if you press Ctrl+C while the error window has focus, its contents are copied to your clipboard, as shown below.)

---------------------------
Restoring Network Connections
---------------------------
An error occurred while reconnecting Y: to
\\MB01.contoso.com\production
Microsoft Windows Network: The local device name is already in use.


This connection has not been restored.

---------------------------
OK
---------------------------


But we captured a two-sided trace so let's start on the client side. As mentioned earlier, SMB takes place over TCP port 445 so we will be using the filter  tcp.port == 445 . This is what we can see:

49 1.506542 172.16.1.17 172.16.1.18 SMB2 188 Tree Connect Request Tree: \\MB01.contoso.com\production
50 1.508804 172.16.1.18 172.16.1.17 SMB2 130 Tree Connect Response, Error: STATUS_BAD_NETWORK_NAME
51 1.509004 172.16.1.17 172.16.1.18 SMB2 188 Tree Connect Request Tree: \\MB01.contoso.com\production
52 1.512821 172.16.1.18 172.16.1.17 SMB2 130 Tree Connect Response, Error: STATUS_BAD_NETWORK_NAME


Wait... Where is the rest of the SMB connection? Well, SMB uses connection pooling, meaning that if there is already an open connection to the SMB server, the client will use that existing connection. Given that there are two mapped shares on this server (the other being development), this existing connection makes sense. To confirm the state of the mappings, we can use the Get-SmbMapping PowerShell cmdlet:

PS C:\> Get-SmbMapping

Status       Local Path Remote Path
------       ---------- -----------
Disconnected Y:         \\MB01.contoso.com\production
OK           Z:         \\MB01.contoso.com\development


This mirrors what we expected so we are good on that front.



To help keep lines of communication clear, the SMB header fields call out which session and tree you are operating on via the Tree Id and Session Id fields of the SMB header.



Regardless, we have a few things we know for sure.

  1. We are getting as far as the SMB Tree Connect.
  2. We know the SMB protocol negotiation was good.
  3. We know the SMB session setup was good.

Given this, the problem seems to be unique to the SMB tree connect. The exact path we are trying to access is \\MB01.contoso.com\production, and the response we are getting from the server is NT Status: STATUS_BAD_NETWORK_NAME (0xc00000cc). That seems like a specific error. What does the protocol specification say about it?



... The server MUST use <normalized hostname, sharename> to look up the Share in ShareList. If no share with a matching share name and server name is found, the server MUST fail the request with STATUS_BAD_NETWORK_NAME.


Source:  3.3.5.7 Receiving an SMB2 TREE_CONNECT Request
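
As an aside, the hex value next to the status name is worth a second look. The top two bits of an NT status code encode its severity, so anything starting with 0xC is an error. Here is a small Python sketch that decodes the handful of codes that show up in this post. The table is deliberately tiny, and the helper name `describe_nt_status` is my own.

```python
# Only the status codes that appear in this post; the real list is far longer.
NT_STATUS_NAMES = {
    0x00000000: "STATUS_SUCCESS",
    0xC0000016: "STATUS_MORE_PROCESSING_REQUIRED",
    0xC0000022: "STATUS_ACCESS_DENIED",
    0xC00000CC: "STATUS_BAD_NETWORK_NAME",
}

# The two most significant bits of an NTSTATUS value are the severity.
SEVERITY = ("success", "informational", "warning", "error")


def describe_nt_status(code):
    """Return a readable description of a 32-bit NT status code."""
    severity = SEVERITY[(code >> 30) & 0x3]
    name = NT_STATUS_NAMES.get(code, "unknown")
    return f"{name} (0x{code:08x}), severity: {severity}"


print(describe_nt_status(0xC00000CC))
```

One wrinkle: STATUS_MORE_PROCESSING_REQUIRED also carries the error severity bits, even though during authentication it only means "keep going", so the severity is a hint and not a verdict.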



That seems pretty straightforward. It seems like the share wasn't found. But why? Well, let's do our due diligence on the server. We are going to confirm the status of the SMB shares on the server by running the Get-SmbShare PowerShell cmdlet.

PS C:\> Get-SmbShare

Name        ScopeName Path                  Description
----        --------- ----                  -----------
ADMIN$      *         C:\Windows            Remote Admin
C$          *         C:\                   Default share
development *         C:\Shares\development
IPC$        *                               Remote IPC


We see development, but we don't see production. With this, I think it's time to chat with the server owner.

  • Q: Howdy Ms. ServerOwner, where is the production share kept on disk?
    • A: It's  C:\Shares\production
  • Q: Can you think of any reason this share might not be there?
    • A: We had some concerns about a security incident this past weekend and we stopped sharing all folders. But it should be reshared as of this morning.

Let's trust but verify. Going onto the server, navigating to the folder in question and checking the sharing properties, we can see this:

[Image: screenshot of the folder's sharing properties]


Looks like it isn't shared. If we click Share and attempt our test again, everything looks good.

Problem solved.

Can't access the share!


You are trying to finish a video project for your client. You have collected all the necessary shots and now you go home and want to move the files onto your more powerful workstation to handle the video rendering.



You set up an SMB share on the workstation and try to connect. And... nothing. The connection fails.



Being the networking rock star you are, you think through a few questions:

  • Is the SMB port listening?


    PS C:\> netstat -ano | Select-string 445
    TCP 0.0.0.0:445 0.0.0.0:0 LISTENING 4
    TCP [::]:445 [::]:0 LISTENING 4

Yep!



  • Can I make a TCP connection via port 445?


    PS C:\> Test-NetConnection workstation.contoso.com -CommonTCPPort SMB

    ComputerName : Workstation.contoso.com
    RemoteAddress : 192.168.1.47
    RemotePort : 445
    InterfaceAlias : Ethernet
    SourceAddress : 192.168.1.100
    TcpTestSucceeded : False




Looks like a no.
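
If you are on a machine without PowerShell, the same kind of check can be approximated in a few lines of Python. This is only a sketch of a plain TCP connect test, not a reimplementation of Test-NetConnection, and the helper name `tcp_port_open` is my own.

```python
import socket


def tcp_port_open(host, port=445, timeout=3.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

Calling `tcp_port_open("workstation.contoso.com")` from the client would be the rough equivalent of the test above. A port that is blocked by a firewall usually shows up as a timeout, while a port with nothing listening is normally refused right away. The sketch treats both as a failure.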


Next you collect a two-sided packet capture. And this is what you can see:

1 0.000000 192.168.1.100 192.168.1.47 TCP 66 50540 → 445 [SYN] Seq=0 Win=64240 Len=0 MSS=1460 WS=256 SACK_PERM
2 1.001224 192.168.1.100 192.168.1.47 TCP 66 [TCP Retransmission] 50540 → 445 [SYN] Seq=0 Win=64240 Len=0 MSS=1460 WS=256 SACK_PERM
7 3.002066 192.168.1.100 192.168.1.47 TCP 66 [TCP Retransmission] 50540 → 445 [SYN] Seq=0 Win=64240 Len=0 MSS=1460 WS=256 SACK_PERM
12 7.003256 192.168.1.100 192.168.1.47 TCP 66 [TCP Retransmission] 50540 → 445 [SYN] Seq=0 Win=64240 Len=0 MSS=1460 WS=256 SACK_PERM
15 15.004147 192.168.1.100 192.168.1.47 TCP 66 [TCP Retransmission] 50540 → 445 [SYN] Seq=0 Win=64240 Len=0 MSS=1460 WS=256 SACK_PERM

// And on the other side you see:
1 0.000000 192.168.1.100 192.168.1.47 TCP 66 50540 → 445 [SYN] Seq=0 Win=64240 Len=0 MSS=1460 WS=256 SACK_PERM
2 1.001224 192.168.1.100 192.168.1.47 TCP 66 [TCP Retransmission] 50540 → 445 [SYN] Seq=0 Win=64240 Len=0 MSS=1460 WS=256 SACK_PERM
7 3.002066 192.168.1.100 192.168.1.47 TCP 66 [TCP Retransmission] 50540 → 445 [SYN] Seq=0 Win=64240 Len=0 MSS=1460 WS=256 SACK_PERM
12 7.003256 192.168.1.100 192.168.1.47 TCP 66 [TCP Retransmission] 50540 → 445 [SYN] Seq=0 Win=64240 Len=0 MSS=1460 WS=256 SACK_PERM
15 15.004147 192.168.1.100 192.168.1.47 TCP 66 [TCP Retransmission] 50540 → 445 [SYN] Seq=0 Win=64240 Len=0 MSS=1460 WS=256 SACK_PERM
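
One thing worth noticing in the captures above: the gaps between the SYN retransmissions double each time. That is TCP backing off while it waits for a SYN, ACK that never arrives. Here is a quick Python sketch that computes the gaps from the timestamps in the trace. The function name `retransmit_gaps` is my own.

```python
# Timestamps of the original SYN and its four retransmissions, from the trace.
syn_times = [0.000000, 1.001224, 3.002066, 7.003256, 15.004147]


def retransmit_gaps(times):
    """Return the time between each consecutive pair of packets."""
    return [later - earlier for earlier, later in zip(times, times[1:])]


gaps = retransmit_gaps(syn_times)

# Each gap is roughly twice the previous one: about 1, 2, 4, and 8 seconds.
doubling = all(round(b / a) == 2 for a, b in zip(gaps, gaps[1:]))
print([round(gap) for gap in gaps], doubling)
```

Because the same SYNs show up on both sides, the packets are arriving at the workstation and simply never being answered.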


This really looks like a basic TCP connectivity issue. But, the next day, you are back in the office and try to do the same thing and you notice it works? What is going on here?



This is going to be the result of the Windows Network Connection Profile. The abridged version of this is, when you are in the office you can  probably  contact a Domain Controller (DC). If you can contact a DC, then your network profile will be set to Domain. Otherwise, unless specified, the network profile will be set to Public.
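
Boiled down to code, the abridged rule reads something like the Python sketch below. This is only the simplified decision described above, not how Windows actually implements network location detection, and the function name `network_category` is hypothetical. The three category names are the values that Get-NetConnectionProfile reports.

```python
def network_category(can_reach_dc, user_choice=None):
    """Simplified model of how the network profile gets picked.

    can_reach_dc: whether a Domain Controller is reachable on this network.
    user_choice:  "Private" or "Public" if someone set it explicitly.
    """
    if can_reach_dc:
        return "DomainAuthenticated"
    # Without a DC, fall back to the explicit choice, else the safe default.
    return user_choice or "Public"


print(network_category(can_reach_dc=True))
print(network_category(can_reach_dc=False))
print(network_category(can_reach_dc=False, user_choice="Private"))
```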



You can check this by running the PowerShell cmdlet  Get-NetConnectionProfile :

PS C:\> Get-NetConnectionProfile

Name : home.wifi
InterfaceAlias : Ethernet
InterfaceIndex : 14
NetworkCategory : Public
DomainAuthenticationKind : None
IPv4Connectivity : Internet
IPv6Connectivity : LocalNetwork


And the result while in the office:

PS C:\> Get-NetConnectionProfile
Name : contoso.com
InterfaceAlias : Ethernet
InterfaceIndex : 14
NetworkCategory : DomainAuthenticated
DomainAuthenticationKind : Ldap
IPv4Connectivity : Internet
IPv6Connectivity : LocalNetwork


The reason for this behavior is that a public network is treated as untrusted. In this untrusted state, a much more restrictive set of firewall rules is applied, which includes blocking inbound SMB traffic. For more on the public network profile, please see Windows Firewall Overview - Public Network.



With this in mind, we change our home network to a Private profile (either via the Settings app or Set-NetConnectionProfile). Reattempting the connection, everything looks good!

What is the name?


It's Friday afternoon, you've just treated yourself to some incredible Indian food for lunch and you hear your desk phone ring.

"The backup job for the SQL database isn't working. We've scoped the issue down to SQL can't access the storage server."

Dang. Time to get back to work. Let’s start with some simple questions.

  • Q: When did things start breaking?
    • A: About 20 minutes ago.
  • Q: What changed?
    • A: We haven't touched the server in 6+ months so I have no clue.
  • Q: What is the server’s name?
    • A:  MB01.contoso.com

Let's jump into some testing. Starting with basic TCP connectivity:

PS C:\> Test-NetConnection mb01.contoso.com -CommonTCPPort SMB

ComputerName : mb01.contoso.com
RemoteAddress : 172.16.1.18
RemotePort : 445
InterfaceAlias : Ethernet
SourceAddress : 172.16.1.17
TcpTestSucceeded : True


TCP connectivity looks good. How about SMB?

PS C:\> Get-ChildItem \\mb01.contoso.com\development\

Directory: \\mb01.contoso.com\development

Mode LastWriteTime Length Name
---- ------------- ------ ----
-a---- 5/17/2024 10:35 AM 10000 dev.db


Okay... What's the problem?

Chatting with the database admin, it comes out that the location being used for the backup is  \\data.contoso.com\development\dev.db .

Running our tests again:

PS C:\> Test-NetConnection data.contoso.com -CommonTCPPort SMB

ComputerName : data.contoso.com
RemoteAddress : 172.16.1.18
RemotePort : 445
InterfaceAlias : Ethernet
SourceAddress : 172.16.1.17
TcpTestSucceeded : True


Wait a second... This is the same IP address. What is going on here? Taking a closer look at the DNS resolution:

PS C:\> Resolve-DnsName data.contoso.com

Name Type TTL Section NameHost
---- ---- --- ------- --------
data.contoso.com CNAME 3600 Answer MB01.contoso.com

Name : MB01.contoso.com
QueryType : A
TTL : 1200
Section : Answer
IP4Address : 172.16.1.18
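
The output above is an alias chain: one name points at another name, which finally points at an address. Here is a toy Python sketch of how a resolver walks that chain. The `RECORDS` dictionary is a hand-written stand-in for the DNS zone, and no real DNS query is made.

```python
# Hand-written copy of the two records from the output above.
RECORDS = {
    "data.contoso.com": ("CNAME", "mb01.contoso.com"),
    "mb01.contoso.com": ("A", "172.16.1.18"),
}


def resolve(name, records, max_hops=8):
    """Follow CNAME records until an A record is found.

    Returns (chain_of_names, ip_address).
    """
    chain = [name.lower()]
    for _ in range(max_hops):
        record_type, value = records[chain[-1]]
        if record_type == "A":
            return chain, value
        chain.append(value.lower())
    raise RuntimeError("alias chain too long")


chain, address = resolve("data.contoso.com", RECORDS)
print(" -> ".join(chain), "->", address)
```

Both names end up at 172.16.1.18, but the name the client asked for is still data.contoso.com.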


We didn't talk about CNAME records (also called alias records) in the previous blog post about DNS, but they are a pointer to another record. In this case  data.contoso.com  is pointing to  MB01.contoso.com . If that is the case this should work, right? Testing the SMB connection:

PS C:\> Get-ChildItem \\data.contoso.com\development

Get-ChildItem : Access is denied
At line:1 char:1
+ Get-ChildItem \\data.contoso.com\development
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ CategoryInfo : PermissionDenied: (\\data.contoso.com\development:String) [Get-ChildItem], UnauthorizedAccessException
+ FullyQualifiedErrorId : ItemExistsUnauthorizedAccessError,Microsoft.PowerShell.Commands.GetChildItemCommand

Get-ChildItem : Cannot find path '\\data.contoso.com\development' because it does not exist.
At line:1 char:1
+ Get-ChildItem \\data.contoso.com\development
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ CategoryInfo : ObjectNotFound: (\\data.contoso.com\development:String) [Get-ChildItem], ItemNotFoundException
+ FullyQualifiedErrorId : PathNotFound,Microsoft.PowerShell.Commands.GetChildItemCommand


That isn't good. But we have an error that we can look into!  PermissionDenied: (\\data.contoso.com\development:String) [Get-ChildItem], UnauthorizedAccessException . It is time that we get a network trace.

Here is what we can see during a reproduction of the behavior:



// Yep this verifies the record is an alias
2 4.754959 172.16.1.17 172.16.1.10 DNS 76 Standard query 0xf322 A data.contoso.com
3 4.757878 172.16.1.10 172.16.1.17 DNS 111 Standard query response 0xf322 A data.contoso.com CNAME MB01.contoso.com A 172.16.1.18

// TCP 3-way handshake looks good
6 4.761897 172.16.1.17 172.16.1.18 TCP 66 49823 → 445 [SYN] Seq=0 Win=64240 Len=0 MSS=1460 WS=256 SACK_PERM
7 4.765902 172.16.1.18 172.16.1.17 TCP 66 445 → 49823 [SYN, ACK] Seq=0 Ack=1 Win=65535 Len=0 MSS=1460 WS=256 SACK_PERM
8 4.766022 172.16.1.17 172.16.1.18 TCP 54 49823 → 445 [ACK] Seq=1 Ack=1 Win=262656 Len=0

// Protocol negotiation looks good
9 4.766146 172.16.1.17 172.16.1.18 SMB 127 Negotiate Protocol Request
10 4.769899 172.16.1.18 172.16.1.17 SMB2 306 Negotiate Protocol Response
11 4.769983 172.16.1.17 172.16.1.18 SMB2 342 Negotiate Protocol Request
12 4.773888 172.16.1.18 172.16.1.17 SMB2 430 Negotiate Protocol Response

23 4.798726 172.16.1.17 172.16.1.18 SMB2 220 Session Setup Request, NTLMSSP_NEGOTIATE

// This isn't necessarily a problem. It just means we need to go through the NTLM authentication
24 4.800397 172.16.1.18 172.16.1.17 SMB2 365 Session Setup Response, Error: STATUS_MORE_PROCESSING_REQUIRED, NTLMSSP_CHALLENGE
25 4.803255 172.16.1.17 172.16.1.18 SMB2 661 Session Setup Request, NTLMSSP_AUTH, User: CONTOSO\will

// This is a problem...
27 4.828528 172.16.1.18 172.16.1.17 SMB2 130 Session Setup Response, Error: STATUS_ACCESS_DENIED
28 4.828859 172.16.1.17 172.16.1.18 TCP 54 49823 → 445 [RST, ACK] Seq=1135 Ack=1016 Win=0 Len=0


We are getting STATUS_ACCESS_DENIED in response to our request, but the same user authenticating to the share via \\mb01.contoso.com\development works? Let's look at the working scenario so we can understand the deviation better.

// TCP 3-way handshake looks good
407 52.113099 172.16.1.17 172.16.1.18 TCP 66 50171 → 445 [SYN] Seq=0 Win=64240 Len=0 MSS=1460 WS=256 SACK_PERM
408 52.116108 172.16.1.18 172.16.1.17 TCP 66 445 → 50171 [SYN, ACK] Seq=0 Ack=1 Win=65535 Len=0 MSS=1460 WS=256 SACK_PERM
409 52.116242 172.16.1.17 172.16.1.18 TCP 54 50171 → 445 [ACK] Seq=1 Ack=1 Win=262656 Len=0

// Protocol negotiation looks good
410 52.116317 172.16.1.17 172.16.1.18 SMB 127 Negotiate Protocol Request
411 52.118684 172.16.1.18 172.16.1.17 SMB2 306 Negotiate Protocol Response
412 52.118778 172.16.1.17 172.16.1.18 SMB2 342 Negotiate Protocol Request
413 52.122692 172.16.1.18 172.16.1.17 SMB2 430 Negotiate Protocol Response

// This looks different. Why?
441 52.149071 172.16.1.17 172.16.1.18 SMB2 467 Session Setup Request
443 52.152310 172.16.1.18 172.16.1.17 SMB2 314 Session Setup Response


Our deviation is in the SMB session setup. Let's look at frame 441 in the working trace and frame 24 in the non-working trace. Enhance.

// Working
Frame 441: 467 bytes on wire (3736 bits), 467 bytes captured (3736 bits) on interface \Device\NPF_{F26B04EB-93FE-45B6-8E1F-7DED5BBC122C}, id 0
Ethernet II, Src: Microsoft_01:2b:07 (00:15:5d:01:2b:07), Dst: Microsoft_01:2b:08 (00:15:5d:01:2b:08)
Internet Protocol Version 4, Src: 172.16.1.17, Dst: 172.16.1.18
Transmission Control Protocol, Src Port: 50171, Dst Port: 445, Seq: 1822, Ack: 629, Len: 413
[2 Reassembled TCP Segments (1873 bytes): #440(1460), #441(413)]
NetBIOS Session Service
SMB2 (Server Message Block Protocol version 2)
SMB2 Header
Session Setup Request (0x01)
[Preauth Hash: 6bd47bbb381153ab91602e8af8506ede2982755b9d389e38aad82401d3126873df8aebef4ed5995f996a6cfad143fef8c8bf7c52c72787aad3ddf9122c67d0e7]
StructureSize: 0x0019
Flags: 0
Security mode: 0x01, Signing enabled
Capabilities: 0x00000001, DFS
Channel: None (0x00000000)
Previous Session Id: 0x0000000000000000
Blob Offset: 0x00000058
Blob Length: 1781
Security Blob [truncated]: 608206f106062b0601050502a08206e5308206e1a030302e06092a864882f71201020206092a864886f712010202060a2b06010401823702021e060a2b06010401823702020aa28206ab048206a7608206a306092a864886f71201020201006e8206923082068ea00302
GSS-API Generic Security Service Application Program Interface
OID: 1.3.6.1.5.5.2 (SPNEGO - Simple Protected Negotiation)
Simple Protected Negotiation
negTokenInit
mechTypes: 4 items
mechToken [truncated]: 608206a306092a864886f71201020201006e8206923082068ea003020105a10302010ea20703050020000000a38204d1618204cd308204c9a003020105a10d1b0b434f4e544f534f2e434f4da2233021a003020102a11a30181b04636966731b106d6230312e636f6e746f73
krb5_blob [truncated]: 608206a306092a864886f71201020201006e8206923082068ea003020105a10302010ea20703050020000000a38204d1618204cd308204c9a003020105a10d1b0b434f4e544f534f2e434f4da2233021a003020102a11a30181b04636966731b106d6230312e636f6e746f73
KRB5 OID: 1.2.840.113554.1.2.2 (KRB5 - Kerberos 5)
krb5_tok_id: KRB5_AP_REQ (0x0001)
Kerberos
ap-req
pvno: 5
msg-type: krb-ap-req (14)
Padding: 0
ap-options: 20000000
ticket
authenticator

// Non-working
Frame 24: 365 bytes on wire (2920 bits), 365 bytes captured (2920 bits) on interface \Device\NPF_{F26B04EB-93FE-45B6-8E1F-7DED5BBC122C}, id 0
Ethernet II, Src: Microsoft_01:2b:08 (00:15:5d:01:2b:08), Dst: Microsoft_01:2b:07 (00:15:5d:01:2b:07)
Internet Protocol Version 4, Src: 172.16.1.18, Dst: 172.16.1.17
Transmission Control Protocol, Src Port: 445, Dst Port: 49823, Seq: 629, Ack: 528, Len: 311
NetBIOS Session Service
SMB2 (Server Message Block Protocol version 2)
SMB2 Header
Session Setup Response (0x01)
[Preauth Hash: f32a9668f6fff82d8eec2a30d95b3c1804a299a8bf2dcb11049e215358b3a5789db9acc82f71fb4b6656004724d90c843927fd0b806cb1fdfef49c89fc3cf2a3]
StructureSize: 0x0009
Session Flags: 0x0000
Blob Offset: 0x00000048
Blob Length: 235
Security Blob [truncated]: a181e83081e5a0030a0101a10c060a2b06010401823702020aa281cf0481cc4e544c4d53535000020000000e000e0038000000158289e2d762f15851b5c9b2000000000000000086008600460000000a007c4f0000000f43004f004e0054004f0053004f0002000e0043
GSS-API Generic Security Service Application Program Interface
Simple Protected Negotiation
negTokenTarg
negResult: accept-incomplete (1)
supportedMech: 1.3.6.1.4.1.311.2.2.10 (NTLMSSP - Microsoft NTLM Security Support Provider)
responseToken [truncated]: 4e544c4d53535000020000000e000e0038000000158289e2d762f15851b5c9b2000000000000000086008600460000000a007c4f0000000f43004f004e0054004f0053004f0002000e0043004f004e0054004f0053004f00010008004d00420030003100040016006300
NTLM Secure Service Provider
NTLMSSP identifier: NTLMSSP
NTLM Message Type: NTLMSSP_CHALLENGE (0x00000002)
Target Name: CONTOSO
Negotiate Flags: 0xe2898215, Negotiate 56, Negotiate Key Exchange, Negotiate 128, Negotiate Version, Negotiate Target Info, Negotiate Extended Session Security, Target Type Domain, Negotiate Always Sign, Negotiate NTLM key, Negotiate Sign
NTLM Server Challenge: d762f15851b5c9b2
Reserved: 0000000000000000
Target Info
Version 10.0 (Build 20348); NTLM Current Revision 15
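
If you ever need to identify a blob like the responseToken above without Wireshark's help, the first few bytes tell you a lot. Here is a minimal Python sketch that reads the fixed start of an NTLM challenge message. The bytes are the first 32 bytes of the responseToken from frame 24, and the helper name `parse_ntlm_challenge` is my own.

```python
import struct

NTLM_MESSAGE_TYPES = {
    1: "NTLMSSP_NEGOTIATE",
    2: "NTLMSSP_CHALLENGE",
    3: "NTLMSSP_AUTH",
}


def parse_ntlm_challenge(token):
    """Read the fixed-position fields at the start of an NTLM challenge."""
    if token[:8] != b"NTLMSSP\x00":
        raise ValueError("not an NTLMSSP message")
    (message_type,) = struct.unpack_from("<I", token, 8)
    if message_type != 2:
        raise ValueError("not a challenge message")
    # Bytes 12-19 describe the target name; the flags follow at offset 20.
    (flags,) = struct.unpack_from("<I", token, 20)
    return {
        "message_type": NTLM_MESSAGE_TYPES[message_type],
        "negotiate_flags": flags,
        "server_challenge": token[24:32].hex(),
    }


# First 32 bytes of the responseToken shown in frame 24.
token = bytes.fromhex(
    "4e544c4d53535000"   # "NTLMSSP\0"
    "02000000"           # message type
    "0e000e0038000000"   # target name length, max length, offset
    "158289e2"           # negotiate flags
    "d762f15851b5c9b2"   # server challenge
)
challenge = parse_ntlm_challenge(token)
print(challenge["message_type"], hex(challenge["negotiate_flags"]))
```

The flags and server challenge that come out match the Negotiate Flags (0xe2898215) and NTLM Server Challenge lines in the dissection. A Kerberos blob, by contrast, is identified by the KRB5 OID (1.2.840.113554.1.2.2) seen in frame 441.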


There is a big difference. We are using Kerberos to authenticate in the working scenario and NTLM in the non-working scenario. I haven't talked about Kerberos and NTLM yet, so we can just think of these as black boxes for now. But just know that if we access a resource via IP address instead of name, we will attempt to authenticate via NTLM. With that in mind, let's try to get an apples-to-apples comparison by connecting via IP address.

20 15.819851 172.16.1.17 172.16.1.18 TCP 66 49782 → 445 [SYN] Seq=0 Win=64240 Len=0 MSS=1460 WS=256 SACK_PERM
21 15.821828 172.16.1.18 172.16.1.17 TCP 66 445 → 49782 [SYN, ACK] Seq=0 Ack=1 Win=65535 Len=0 MSS=1460 WS=256 SACK_PERM
22 15.821936 172.16.1.17 172.16.1.18 TCP 54 49782 → 445 [ACK] Seq=1 Ack=1 Win=262656 Len=0
23 15.821979 172.16.1.17 172.16.1.18 SMB 127 Negotiate Protocol Request
24 15.825852 172.16.1.18 172.16.1.17 SMB2 306 Negotiate Protocol Response
25 15.825963 172.16.1.17 172.16.1.18 SMB2 334 Negotiate Protocol Request
26 15.829827 172.16.1.18 172.16.1.17 SMB2 430 Negotiate Protocol Response
27 15.847369 172.16.1.17 172.16.1.18 SMB2 220 Session Setup Request, NTLMSSP_NEGOTIATE
28 15.849945 172.16.1.18 172.16.1.17 SMB2 365 Session Setup Response, Error: STATUS_MORE_PROCESSING_REQUIRED, NTLMSSP_CHALLENGE
29 15.852842 172.16.1.17 172.16.1.18 SMB2 651 Session Setup Request, NTLMSSP_AUTH, User: CONTOSO\will
30 15.861978 172.16.1.18 172.16.1.17 SMB2 159 Session Setup Response


Clear as day, if we use NTLM via IP address everything works. What is going on? We know that this is something that is unique to the name  data.contoso.com .

We have been able to dissect things down to:

When a CNAME record is in place, we cannot authenticate using NTLM to an SMB share. And with a quick Bing search, we found our answer: SMB file server share access is unsuccessful through DNS CNAME alias . That sounds right, right?



Let’s check the SMB server configuration.

PS C:\> Get-SmbServerConfiguration

<snip>
EnableStrictNameChecking : True
<snip>


That matches exactly what the Learn article describes. If we follow its advice, we have two options:

  1. Stop using a CNAME record (aka update the SQL database backup string)
  2. Register the SPN for the CNAME (we will get more into SPNs when I talk about Kerberos)

Why is it slow!


On a beautiful Monday morning, your colleague approaches you with the following problem. "Hey buddy, I've noticed that one of our data servers is seeing poor performance reading from the data store. Can you give me a hand?".

And you begin with some questions:

  • Q: When did you first start noticing this?
    • A: Started last week
  • Q: What changed around this time?
    • A: We started splitting our data chunks into smaller files
  • Q: What was the performance before?
    • A: I'm not sure but it  feels  slower.
  • Q: What are the data store servers?
    • A: We have two. MB01 and MB02
  • Q: Are both affected?
    • A: No. Only MB01

Now SMB performance is tricky as there are many factors that come into play. We need to start by establishing a baseline. To do this, we will be using the command line tool  robocopy .

I will be using the following flags:

  • /NJH This suppresses the robocopy job header to keep the output concise
  • /NFL This prevents the individual files from being listed

Starting with our baseline:

PS C:\temp\datasets> robocopy \\MB02.contoso.com\development\inputs\ . *.bin /NJH /NFL

101 \\MB02.contoso.com\development\inputs\

------------------------------------------------------------------------------

Total Copied Skipped Mismatch FAILED Extras
Dirs : 1 0 1 0 0 0
Files : 101 101 0 0 0 0
Bytes : 1.009 g 1.009 g 0 0 0 0
Times : 0:00:10 0:00:10 0:00:00 0:00:00


Speed : 105,494,087 Bytes/sec.
Speed : 6,036.420 MegaBytes/min.
Ended : Monday, May 20, 2024 9:00:55 AM


We have 101 files in a total of 10 seconds. Not bad. Notably, within SMB there is something known as the "Small Files Problem". In short, if we can get SMB to spend more time on transferring data and less time working with headers, then the transfer will be faster. For more details please see  Slow Transfer of Small Files . Let's run our test again, but with one  BIG  file.

PS C:\temp\datasets> robocopy \\MB02.contoso.com\development\ . *.bin /NJH /NFL

1 \\MB02.contoso.com\development\

------------------------------------------------------------------------------

Total Copied Skipped Mismatch FAILED Extras
Dirs : 1 0 1 0 0 0
Files : 1 1 0 0 0 101
Bytes : 1.009 g 1.009 g 0 0 0 1.009 g
Times : 0:00:08 0:00:08 0:00:00 0:00:00


Speed : 122,071,051 Bytes/sec.
Speed : 6,984.961 MegaBytes/min.
Ended : Monday, May 20, 2024 9:05:44 AM


A little bit quicker but not night and day. Cool. We have our baseline. How different is the slow server?

PS C:\temp\datasets> robocopy \\MB01.contoso.com\development\inputs\ . *.bin /NJH /NFL

101 \\MB01.contoso.com\development\inputs\

------------------------------------------------------------------------------

Total Copied Skipped Mismatch FAILED Extras
Dirs : 1 0 1 0 0 0
Files : 101 101 0 0 0 0
Bytes : 1.009 g 1.009 g 0 0 0 0
Times : 0:00:17 0:00:17 0:00:00 0:00:00


Speed : 63,702,961 Bytes/sec.
Speed : 3,645.113 MegaBytes/min.
Ended : Monday, May 20, 2024 9:09:09 AM


Oh my... This is a huge difference. How about one large file?

PS C:\temp\datasets> robocopy \\MB01.contoso.com\development\ . *.bin /NJH /NFL

1 \\MB01.contoso.com\development\

------------------------------------------------------------------------------

Total Copied Skipped Mismatch FAILED Extras
Dirs : 1 0 1 0 0 0
Files : 1 1 0 0 0 101
Bytes : 1.009 g 1.009 g 0 0 0 1.009 g
Times : 0:00:09 0:00:09 0:00:00 0:00:00


Speed : 115,875,544 Bytes/sec.
Speed : 6,630.452 MegaBytes/min.
Ended : Monday, May 20, 2024 9:12:38 AM


This is interesting... The single large file transfers much faster than the many small files did, but it is still a little slower than on the known good server.
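To put numbers on that comparison, here is a minimal Python sketch that divides MB01's throughput by MB02's for each workload. The Bytes/sec values are copied from the four robocopy summaries above; the function and variable names are just illustrative.

```python
# Bytes/sec figures copied from the four robocopy summaries above.
SPEEDS = {
    ("MB02", "many small files"): 105_494_087,
    ("MB02", "one big file"): 122_071_051,
    ("MB01", "many small files"): 63_702_961,
    ("MB01", "one big file"): 115_875_544,
}


def relative_throughput(workload, slow="MB01", baseline="MB02"):
    """Return the slow server's throughput as a fraction of the baseline's."""
    return SPEEDS[(slow, workload)] / SPEEDS[(baseline, workload)]


if __name__ == "__main__":
    for workload in ("many small files", "one big file"):
        ratio = relative_throughput(workload)
        print(f"{workload}: MB01 runs at {ratio:.0%} of MB02")
```

With these figures MB01 manages roughly 60% of the baseline on the many-small-files copy but roughly 95% on the single large file, so whatever is wrong hurts the chatty workload far more.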

I think it is time for us to take a packet capture of the many small files.

// TCP 3-way handshake looks good
75 3.875690 172.16.1.17 172.16.1.18 TCP 66 49782 → 445 [SYN] Seq=0 Win=64240 Len=0 MSS=1460 WS=256 SACK_PERM
76 3.877673 172.16.1.18 172.16.1.17 TCP 66 445 → 49782 [SYN, ACK] Seq=0 Ack=1 Win=65535 Len=0 MSS=1460 WS=256 SACK_PERM
77 3.877793 172.16.1.17 172.16.1.18 TCP 54 49782 → 445 [ACK] Seq=1 Ack=1 Win=262656 Len=0

// SMB setup looks good
78 3.877834 172.16.1.17 172.16.1.18 SMB 127 Negotiate Protocol Request
79 3.879769 172.16.1.18 172.16.1.17 SMB2 306 Negotiate Protocol Response
80 3.879888 172.16.1.17 172.16.1.18 SMB2 342 Negotiate Protocol Request
81 3.883793 172.16.1.18 172.16.1.17 SMB2 430 Negotiate Protocol Response
93 3.894040 172.16.1.17 172.16.1.18 SMB2 467 Session Setup Request
95 3.897813 172.16.1.18 172.16.1.17 SMB2 314 Session Setup Response
105 3.904149 172.16.1.17 172.16.1.18 SMB2 190 Tree Connect Request Tree: \\MB01.contoso.com\development
106 3.907808 172.16.1.18 172.16.1.17 SMB2 138 Tree Connect Response

// We open a handle to the inputs subdirectory
113 3.918151 172.16.1.17 172.16.1.18 SMB2 218 Create Request File: inputs
114 3.921885 172.16.1.18 172.16.1.17 SMB2 266 Create Response File: inputs

// Searching for files in the directory
119 3.928079 172.16.1.17 172.16.1.18 SMB2 260 Find Request File: inputs SMB2_FIND_ID_BOTH_DIRECTORY_INFO Pattern: *;Find Request File: inputs SMB2_FIND_ID_BOTH_DIRECTORY_INFO Pattern: *
129 3.931927 172.16.1.18 172.16.1.17 SMB2 1022 Find Response;Find Response, Error: STATUS_NO_MORE_FILES

// Getting a handle to the first file
186 4.159137 172.16.1.17 172.16.1.18 SMB2 374 Create Request File: inputs\dataset_0.bin
187 4.160930 172.16.1.18 172.16.1.17 SMB2 378 Create Response File: inputs\dataset_0.bin

// Reading the contents
194 4.167610 172.16.1.17 172.16.1.18 SMB2 171 Read Request Len:1048576 Off:0 File: inputs\dataset_0.bin
195 4.167684 172.16.1.17 172.16.1.18 SMB2 171 Read Request Len:1048576 Off:1048576 File: inputs\dataset_0.bin
218 4.171116 172.16.1.17 172.16.1.18 SMB2 171 Read Request Len:1048576 Off:2097152 File: inputs\dataset_0.bin
219 4.171129 172.16.1.17 172.16.1.18 SMB2 171 Read Request Len:1048576 Off:3145728 File: inputs\dataset_0.bin
943 4.195287 172.16.1.18 172.16.1.17 SMB2 1514 Read Response
1141 4.195851 172.16.1.17 172.16.1.18 SMB2 288 Read Request Len:1048576 Off:5242880 File: inputs\dataset_0.bin
1678 4.201491 172.16.1.18 172.16.1.17 SMB2 1514 Read Response
1779 4.203320 172.16.1.17 172.16.1.18 SMB2 171 Read Request Len:1048576 Off:6291456 File: inputs\dataset_0.bin
2422 4.209516 172.16.1.18 172.16.1.17 SMB2 1514 Read Response
2752 4.210860 172.16.1.17 172.16.1.18 SMB2 288 Read Request Len:1048576 Off:8388608 File: inputs\dataset_0.bin
3158 4.213629 172.16.1.18 172.16.1.17 SMB2 1514 Read Response
3440 4.216298 172.16.1.17 172.16.1.18 SMB2 288 Read Request Len:251658 Off:10485760 File: inputs\dataset_0.bin
3898 4.219715 172.16.1.18 172.16.1.17 SMB2 1514 Read Response
4642 4.225274 172.16.1.18 172.16.1.17 SMB2 1514 Read Response

// Closing the handle
9619 4.431557 172.16.1.17 172.16.1.18 SMB2 146 Close Request File: inputs\dataset_0.bin
9620 4.482862 172.16.1.18 172.16.1.17 SMB2 182 Close Response
// Repeat for the other files
...
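
The read pattern in this trace is a plain sequential walk through the file in 1 MiB requests, with a shorter final request. The Python sketch below rebuilds that plan. The file size is an inference from the last read (offset 10485760 plus length 251658), not a value stated in the capture, and the excerpt above only shows some of the requests.

```python
READ_SIZE = 1_048_576  # 1 MiB, the Len value on the full-sized reads in the trace


def read_plan(file_size, read_size=READ_SIZE):
    """Return the (offset, length) pairs needed to read file_size bytes in order."""
    plan = []
    offset = 0
    while offset < file_size:
        length = min(read_size, file_size - offset)
        plan.append((offset, length))
        offset += length
    return plan


# Inferred, not stated in the trace: the last read starts at 10485760
# and asks for 251658 bytes, so the file should end there.
INFERRED_FILE_SIZE = 10_485_760 + 251_658

if __name__ == "__main__":
    plan = read_plan(INFERRED_FILE_SIZE)
    print(len(plan), "reads; last read:", plan[-1])
```

If the offsets and lengths in a real trace line up with a plan like this, the SMB layer is behaving as expected and it is time to look lower in the stack.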


From the SMB layer, everything looks normal. Let's go a layer deeper (TCP) and see what we can see.

10394 4.715089 172.16.1.17 172.16.1.18 SMB2 171 Read Request Len:1048576 Off:7340032 File: inputs\dataset_1.bin
11128 4.717836 172.16.1.17 172.16.1.18 TCP 66 [TCP Dup ACK 10394#1] 49782 → 445 [ACK] Seq=8917 Ack=11826591 Win=4204800 Len=0 SLE=11968211 SRE=11969671
11129 4.717848 172.16.1.17 172.16.1.18 TCP 66 [TCP Dup ACK 10394#2] 49782 → 445 [ACK] Seq=8917 Ack=11826591 Win=4204800 Len=0 SLE=11968211 SRE=11971131
11130 4.717855 172.16.1.17 172.16.1.18 TCP 66 [TCP Dup ACK 10394#3] 49782 → 445 [ACK] Seq=8917 Ack=11826591 Win=4204800 Len=0 SLE=11968211 SRE=11972591
...
11861 4.722951 172.16.1.18 172.16.1.17 TCP 1514 [TCP Fast Retransmission] 445 → 49782 [ACK] Seq=11826591 Ack=8917 Win=2097408 Len=1460 [TCP segment of a reassembled PDU]


BINGO! TCP retransmissions. We have packet loss! And when we look at the other side of our connection, we can see that the read request never arrived. With the read never arriving, the retransmission delays the delivery of data to the client. This trend of the TCP ACK from the client to the server being dropped continues throughout the trace.
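When a trace has thousands of these lines, it helps to summarize them rather than read them one at a time. Below is a rough Python sketch that works on packet-list lines exported as text. It assumes each duplicate ACK carries a single SACK block formatted as SLE=… SRE=…, as in the lines above; the helper names are made up for this example.

```python
import re

# Pulls the Ack, SLE (SACK left edge) and SRE (SACK right edge) values
# out of a summary line that Wireshark flagged as a duplicate ACK.
DUP_ACK = re.compile(r"\[TCP Dup ACK [^\]]*\].*?Ack=(\d+).*?SLE=(\d+) SRE=(\d+)")


def summarize_dup_acks(lines):
    """Return (missing_bytes, sacked_bytes) for every duplicate ACK line."""
    results = []
    for line in lines:
        match = DUP_ACK.search(line)
        if match:
            ack, sle, sre = (int(group) for group in match.groups())
            results.append((sle - ack, sre - sle))
    return results


def count_retransmissions(lines):
    """Count lines that Wireshark flagged as any kind of TCP retransmission."""
    return sum(1 for line in lines if "Retransmission]" in line)


# The four flagged lines from the trace above.
TRACE = [
    "11128 4.717836 172.16.1.17 172.16.1.18 TCP 66 [TCP Dup ACK 10394#1] 49782 → 445 [ACK] Seq=8917 Ack=11826591 Win=4204800 Len=0 SLE=11968211 SRE=11969671",
    "11129 4.717848 172.16.1.17 172.16.1.18 TCP 66 [TCP Dup ACK 10394#2] 49782 → 445 [ACK] Seq=8917 Ack=11826591 Win=4204800 Len=0 SLE=11968211 SRE=11971131",
    "11130 4.717855 172.16.1.17 172.16.1.18 TCP 66 [TCP Dup ACK 10394#3] 49782 → 445 [ACK] Seq=8917 Ack=11826591 Win=4204800 Len=0 SLE=11968211 SRE=11972591",
    "11861 4.722951 172.16.1.18 172.16.1.17 TCP 1514 [TCP Fast Retransmission] 445 → 49782 [ACK] Seq=11826591 Ack=8917 Win=2097408 Len=1460 [TCP segment of a reassembled PDU]",
]

if __name__ == "__main__":
    for missing, sacked in summarize_dup_acks(TRACE):
        print(f"waiting on {missing} bytes, holding {sacked} bytes past the gap")
    print("retransmissions:", count_retransmissions(TRACE))
```

For these lines the gap between the acknowledged byte and the SACK left edge is 141620 bytes each time, while the SACKed range grows by one 1460-byte segment per duplicate ACK. The fast retransmission then resends the segment that starts at Seq=11826591, which is exactly the byte the duplicate ACKs keep asking for.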

With this inbound packet loss to MB01 our behavior makes more sense.

  1. When transferring lots of small files, there is lots of protocol overhead.
    • Client sends a request; server responds to the request.
    • If the request is dropped, the process is delayed.
  2. When transferring one big file, the initial protocol work is done, then TCP sends as much data over the wire as it can stomach.
    • This leaves only the TCP ACKs being sent back to the server.

With this in mind, we chat with our network admin friends and ask them if the switch between these two endpoints is on the fritz. If so, let's get ourselves a new one.

SMB2.What?




Picture it. Labor Day weekend. You have grand plans to do some grilling by the pool. But tragedy strikes. The on-call phone rings and your colleague Mary informs you that backups aren’t working. Time to investigate and see if we can save the weekend.

Starting with some questions:

  • Q: What is the problem?
    • A: Since Friday at 20:00, backups haven’t been running
  • Q: What changed around this time?
    • A: This is typically the change control window, so here is a list of what changed:
      • Windows Updates were applied
      • New anti-virus software was installed
      • The old network switches were replaced
      • The security team disabled legacy behavior
  • Q: What is the name of the server?
    • A: MB01.contoso.com (It’s always something with this guy)

We’ll start with some simple tests:

  • Can I make a TCP connection to port 445?


    PS C:\> Test-NetConnection mb01.contoso.com -CommonTCPPort SMB

    ComputerName : mb01.contoso.com
    RemoteAddress : 172.16.1.18
    RemotePort : 445
    InterfaceAlias : Ethernet
    SourceAddress : 172.16.1.17
    TcpTestSucceeded : True


Yep, TCP looks good.



If TCP looks good then we are likely dealing with an issue with a higher layer protocol (SMB, authentication, etc…).
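If you are troubleshooting from a machine without PowerShell, the same reachability check can be approximated in a few lines of Python. This is only a sketch of the TCP part of the test: it confirms that a three-way handshake completes and says nothing about SMB itself. The function name is hypothetical.

```python
import socket


def tcp_port_open(host, port=445, timeout=3.0):
    """Return True if a TCP three-way handshake to host:port completes."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


# Example, using the server name from this post:
#   tcp_port_open("mb01.contoso.com")
```

A True here lines up with TcpTestSucceeded : True above. It only proves that something is listening on the port, which is why the investigation moves up the stack next.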


Let’s try to reproduce the issue ourselves and see what we can see. Attempting to access \\MB01.contoso.com\backups via File Explorer gives us the following error:


[Window Title]
File Explorer

[Content]
Windows can't find '\\MB01.contoso.com\Backups'. Check the spelling and try again.

[OK]


Got it. This error makes me think of the earlier issue where a share wasn’t actually shared. Let’s check with Get-SmbShare on the server.



PS C:\> Get-SmbShare

Name ScopeName Path Description
---- --------- ---- -----------
ADMIN$ * C:\Windows Remote Admin
backups * C:\Shares\backups
C$ * C:\ Default share
development * C:\Shares\development
IPC$ * Remote IPC


Nope. Backups is shared. I think it is time to dig into some packet capture analysis.

Here is our attempted connection to the server.

// TCP connection looks good (as we already knew)
419 22.789464 172.16.1.17 172.16.1.18 TCP 66 49808 → 445 [SYN] Seq=0 Win=64240 Len=0 MSS=1460 WS=256 SACK_PERM
420 22.793431 172.16.1.18 172.16.1.17 TCP 66 445 → 49808 [SYN, ACK] Seq=0 Ack=1 Win=65535 Len=0 MSS=1460 WS=256 SACK_PERM
421 22.793552 172.16.1.17 172.16.1.18 TCP 54 49808 → 445 [ACK] Seq=1 Ack=1 Win=262656 Len=0

// This looks bad
422 22.793600 172.16.1.17 172.16.1.18 SMB 127 Negotiate Protocol Request
423 22.797460 172.16.1.18 172.16.1.17 TCP 54 445 → 49808 [RST, ACK] Seq=1 Ack=74 Win=0 Len=0


The client sent out its SMB Negotiate, and the server responded by resetting the connection with a TCP RST, ACK. That seems odd.


Let’s take a closer look at the Negotiate request.

Frame 422: 127 bytes on wire (1016 bits), 127 bytes captured (1016 bits) on interface \Device\NPF_{F26B04EB-93FE-45B6-8E1F-7DED5BBC122C}, id 0
Ethernet II, Src: Microsoft_01:2b:07 (00:15:5d:01:2b:07), Dst: Microsoft_01:2b:08 (00:15:5d:01:2b:08)
Internet Protocol Version 4, Src: 172.16.1.17, Dst: 172.16.1.18
Transmission Control Protocol, Src Port: 49808, Dst Port: 445, Seq: 1, Ack: 1, Len: 73
NetBIOS Session Service
SMB (Server Message Block Protocol)
SMB Header
Negotiate Protocol Request (0x72)
Word Count (WCT): 0
Byte Count (BCC): 34
Requested Dialects
Dialect: NT LM 0.12
Dialect: SMB 2.002
Dialect: SMB 2.???



Really not a ton to see in here. We are advertising the SMB dialects we support and that is about it. We support:

  • NT LanManager 0.12 (In 2024 I certainly hope this isn’t the best dialect that is shared…)
  • SMB 2.002
  • And an SMB2 wildcard (SMB 2.???)
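
The sizes in the frame above can be cross-checked by hand. In an SMB1 Negotiate request each dialect is sent as a 0x02 byte followed by a null-terminated string. The Python sketch below rebuilds that buffer and adds up the layers; the header sizes are the standard fixed sizes (4-byte NetBIOS Session Service header, 32-byte SMB1 header), and the function names are illustrative only.

```python
DIALECTS = ["NT LM 0.12", "SMB 2.002", "SMB 2.???"]


def encode_dialects(dialects):
    """Encode dialect names as 0x02-prefixed, null-terminated strings."""
    return b"".join(b"\x02" + name.encode("ascii") + b"\x00" for name in dialects)


def decode_dialects(buffer):
    """Split a dialect buffer back into its dialect names."""
    names = []
    for entry in buffer.split(b"\x00"):
        if not entry:
            continue  # empty piece after the final terminator
        if entry[:1] != b"\x02":
            raise ValueError("dialect entry is missing the 0x02 format byte")
        names.append(entry[1:].decode("ascii"))
    return names


NETBIOS_HEADER = 4  # NetBIOS Session Service header
SMB1_HEADER = 32    # fixed-size SMB1 header
WORD_COUNT = 1      # WCT field (zero parameter words in this request)
BYTE_COUNT = 2      # BCC field

payload = encode_dialects(DIALECTS)
tcp_len = NETBIOS_HEADER + SMB1_HEADER + WORD_COUNT + BYTE_COUNT + len(payload)

if __name__ == "__main__":
    print("dialect bytes:", len(payload))
    print("TCP payload length:", tcp_len)
    print(decode_dialects(payload))
```

The dialect buffer comes to 34 bytes, matching Byte Count (BCC): 34, and the total comes to 73 bytes, matching Len: 73 in the TCP layer. That also explains the Ack=74 on the server's reset: the server received the whole request before tearing the connection down.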

With this in mind, let’s take a look at our SMB server configuration using Get-SmbServerConfiguration to see if we can glean why it wouldn’t accept these dialects.

PS C:\> Get-SmbServerConfiguration

<snip>
EnableSMB2Protocol : False
<snip>


What? Why is that disabled? You chat with your colleague about why this was changed: “According to the security team, they were looking to disable SMB2 so that we would only use SMB3.” Ah… This is a common point of confusion.

SMB3 is a dialect of SMB2. If you disable SMB2, then you disable SMB3 as well. Dialects within SMB are more like tweaks to the functionality rather than a holistic change.
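
One way to picture this is as a toy model in Python. The dialect revision codes are the ones the SMB2 protocol family defines. The function is a simplification for illustration, not how the server actually decides, and the two parameters simply mirror the EnableSMB1Protocol and EnableSMB2Protocol settings.

```python
# Dialect revision codes in the SMB2 protocol family.
SMB2_FAMILY = {
    0x0202: "SMB 2.0.2",
    0x0210: "SMB 2.1",
    0x0300: "SMB 3.0",
    0x0302: "SMB 3.0.2",
    0x0311: "SMB 3.1.1",
}


def usable_dialects(enable_smb1, enable_smb2):
    """Toy model: list the dialects a server with these switches could negotiate."""
    dialects = []
    if enable_smb1:
        dialects.append("NT LM 0.12")
    if enable_smb2:
        # One switch covers the whole family, SMB 3.x included.
        dialects.extend(SMB2_FAMILY.values())
    return dialects


if __name__ == "__main__":
    print(usable_dialects(enable_smb1=False, enable_smb2=False))
    print(usable_dialects(enable_smb1=False, enable_smb2=True))
```

Assuming SMB1 was also off on MB01 (the post does not show that setting), turning off SMB2 leaves the server with nothing to offer, which fits the reset we saw in response to the Negotiate.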

After re-enabling SMB2, the issue no longer reproduces, and Labor Day weekend is saved. Time to tan.



Wrap up

There are a lot of different things that we covered in this post, but if there is one key takeaway, it should be this:

  • Network file systems are complex.
  • They are subject to bottlenecks anywhere in their path:
    • Remote file system
    • Local file system
    • Transport layer (for a refresher, see TCP Connectivity and TCP Performance)
    • Authentication

And the good news is there is lots of great content from the very smart SMB folks. Here are my recommendations for continued learning:


But at the end of the day, if we keep calm, ask good questions, and follow the data, then we are going to be in good shape. Catch y’all next time!
