When SetNodeTags changed a node's tags, the node's self view wasn't
updated. The bug manifested as: the first SetNodeTags call updates
the server but the client's self view doesn't update until a second
call with the same tag.
Root cause: Three issues combined to prevent self-updates:
1. SetNodeTags returned PolicyChange which doesn't set OriginNode,
so the mapper's self-update check failed.
2. The Change.Merge function didn't preserve OriginNode, so when
changes were batched together, OriginNode was lost.
3. generateMapResponse checked OriginNode only in buildFromChange(),
but PolicyChange uses RequiresRuntimePeerComputation which
bypasses that code path entirely and calls policyChangeResponse()
instead.
The fix addresses all three:
- state.go: Set OriginNode on the returned change
- change.go: Preserve OriginNode (and TargetNode) during merge
- batcher.go: Pass isSelfUpdate to policyChangeResponse so the
origin node gets both self info AND packet filters
- mapper.go: Add includeSelf parameter to policyChangeResponse
Fixes#2978
User.Proto() was returning u.Name directly, which is empty for OIDC
users who have their identifier in the Email field instead. This caused
"headscale nodes list" to show empty user names for OIDC-authenticated
nodes.
Only fall back to Username() when Name is empty, which provides a
display-friendly identifier (Email > ProviderIdentifier > ID). This
ensures OIDC users display their email while CLI users retain their
original Name.
Fixes#2972
This commit replaces the ChangeSet with a simpler bool based
change model that can be directly used in the map builder to
build the appropriate map response based on the change that
has occured. Previously, we fell back to sending full maps
for a lot of changes as that was consider "the safe" thing to
do to ensure no updates were missed.
This was slightly problematic as a node that already has a list
of peers will only do full replacement of the peers if the list
is non-empty, meaning that it was not possible to remove all
nodes (if for example policy changed).
Now we will keep track of last seen nodes, so we can send remove
ids, but also we are much smarter on how we send smaller, partial
maps when needed.
Fixes#2389
Signed-off-by: Kristoffer Dalby <kristoffer@dalby.cc>
This commit changes so that node changes to the policy is
calculated if any of the nodes has changed in a way that might
affect the policy.
Previously we just checked if the number of nodes had changed,
which meant that if a node was added and removed, we would be
in a bad state.
Signed-off-by: Kristoffer Dalby <kristoffer@dalby.cc>
This PR changes tags to be something that exists on nodes in addition to users, to being its own thing. It is part of moving our tags support towards the correct tailscale compatible implementation.
There are probably rough edges in this PR, but the intention is to get it in, and then start fixing bugs from 0.28.0 milestone (long standing tags issue) to discover what works and what doesnt.
Updates #2417Closes#2619
When we fixed the issue of node visibility of nodes
that only had access to eachother because of a subnet
route, we gave all nodes access to all exit routes by
accident.
This commit splits exit nodes and subnet routes in the
access.
If a matcher indicates that the node should have access to
any part of the subnet routes, we do not remove it from the
node list.
If a matcher destination is equal to the internet, and the
target node is an exit node, we also do not remove the access.
Fixes#2784Fixes#2788
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
There are situations where the subnet routes and exit nodes
must be treated differently. This splits it so SubnetRoutes
only returns routes that are not exit routes.
It adds `IsExitRoutes` and `AllApprovedRoutes` for convenience.
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
Correctly identify Viper's ConfigFileNotFoundError in LoadConfig to log a warning and use defaults, unifying behavior with empty config files. Fixes fatal error when no config file is present for CLI commands relying on environment variables.
This PR addresses some consistency issues that was introduced or discovered with the nodestore.
nodestore:
Now returns the node that is being put or updated when it is finished. This closes a race condition where when we read it back, we do not necessarily get the node with the given change and it ensures we get all the other updates from that batch write.
auth:
Authentication paths have been unified and simplified. It removes a lot of bad branches and ensures we only do the minimal work.
A comprehensive auth test set has been created so we do not have to run integration tests to validate auth and it has allowed us to generate test cases for all the branches we currently know of.
integration:
added a lot more tooling and checks to validate that nodes reach the expected state when they come up and down. Standardised between the different auth models. A lot of this is to support or detect issues in the changes to nodestore (races) and auth (inconsistencies after login and reaching correct state)
This PR was assisted, particularly tests, by claude code.
- tailscale client gets a new AuthUrl and sets entry in the regcache
- regcache entry expires
- client doesn't know about that
- client always polls followup request а gets error
When user clicks "Login" in the app (after cache expiry), they visit
invalid URL and get "node not found in registration cache". Some clients
on Windows for e.g. can't get a new AuthUrl without restart the app.
To fix that we can issue a new reg id and return user a new valid
AuthUrl.
RegisterNode is refactored to be created with NewRegisterNode() to
autocreate channel and other stuff.
Initial work on a nodestore which stores all of the nodes
and their relations in memory with relationship for peers
precalculated.
It is a copy-on-write structure, replacing the "snapshot"
when a change to the structure occurs. It is optimised for reads,
and while batches are not fast, they are grouped together
to do less of the expensive peer calculation if there are many
changes rapidly.
Writes will block until commited, while reads are never
blocked.
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
This patch includes some changes to the OIDC integration in particular:
- Make sure that userinfo claims are queried *before* comparing the
user with the configured allowed groups, email and email domain.
- Update user with group claim from the userinfo endpoint which is
required for allowed groups to work correctly. This is essentially a
continuation of #2545.
- Let userinfo claims take precedence over id token claims.
With these changes I have verified that Headscale works as expected
together with Authelia without the documented escape hatch [0], i.e.
everything works even if the id token only contain the iss and sub
claims.
[0]: https://www.authelia.com/integration/openid-connect/headscale/#configuration-escape-hatch
This commit changes most of our (*)types.Node to
types.NodeView, which is a readonly version of the
underlying node ensuring that there is no mutations
happening in the read path.
Based on the migration, there didnt seem to be any, but the
idea here is to prevent it in the future and simplify other
new implementations.
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
this commit moves all of the read and write logic, and all different parts
of headscale that manages some sort of persistent and in memory state into
a separate package.
The goal of this is to clearly define the boundry between parts of the app
which accesses and modifies data, and where it happens. Previously, different
state (routes, policy, db and so on) was used directly, and sometime passed to
functions as pointers.
Now all access has to go through state. In the initial implementation,
most of the same functions exists and have just been moved. In the future
centralising this will allow us to optimise bottle necks with the database
(in memory state) and make the different parts talking to eachother do so
in the same way across headscale components.
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
* feat: add verify client config for embedded DERP
* refactor: embedded DERP no longer verify clients via HTTP
- register the `headscale://` protocol in `http.DefaultTransport` to intercept network requests
- update configuration to use a single boolean option `verify_clients`
* refactor: use `http.HandlerFunc` for type definition
* refactor: some renaming and restructuring
* chore: some renaming and fix lint
* test: fix TestDERPVerifyEndpoint
- `tailscale debug derp` use random node private key
* test: add verify clients integration test for embedded DERP server
* fix: apply code review suggestions
* chore: merge upstream changes
* fix: apply code review suggestions
---------
Co-authored-by: Kristoffer Dalby <kristoffer@dalby.cc>
* Make matchers part of the Policy interface
* Prevent race condition between rules and matchers
* Test also matchers in tests for Policy.Filter
* Compute `filterChanged` in v2 policy correctly
* Fix nil vs. empty list issue in v2 policy test
* policy/v2: always clear ssh map
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
---------
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
Co-authored-by: Aras Ergus <aras.ergus@tngtech.com>
Co-authored-by: Kristoffer Dalby <kristoffer@tailscale.com>
* types/authkey: include user object, not string
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
* make preauthkeys use id
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
* changelog
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
* integration: wire up user id for auth keys
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
---------
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
* add casbin user test
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
* Delete double slash
* types/users: use join url on iss that are ursl
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
---------
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
Co-authored-by: Juan Font <juanfontalonso@gmail.com>
Tailscale allows to override the local DNS settings of a node via
"Override local DNS" [1]. Restore this flag with the same config setting
name `dns.override_local_dns` but disable it by default to align it with
Tailscale's default behaviour.
Tested with Tailscale 1.80.2 and systemd-resolved on Debian 12.
With `dns.override_local_dns: false`:
```
Link 12 (tailscale0)
Current Scopes: DNS
Protocols: -DefaultRoute -LLMNR -mDNS -DNSOverTLS DNSSEC=no/unsupported
DNS Servers: 100.100.100.100
DNS Domain: tn.example.com ~0.e.1.a.c.5.1.1.a.7.d.f.ip6.arpa [snip]
```
With `dns.override_local_dns: true`:
```
Link 12 (tailscale0)
Current Scopes: DNS
Protocols: +DefaultRoute -LLMNR -mDNS -DNSOverTLS DNSSEC=no/unsupported
DNS Servers: 100.100.100.100
DNS Domain: tn.example.com ~.
```
[1] https://tailscale.com/kb/1054/dns#override-local-dnsFixes: #2256