dotnet restore fails on Linux 4.6.2 #6502
Comments
This behavior reproduced even with versions of This does not appear to be the same as #6500. |
The backtrace in GDB was not useful to me:
|
/cc @piotrpMSFT @livarcocc @ericstj It sounds like you probably don't have a required prerequisite on the machine, but I would have no idea which one... I'm not even sure of the list of required prereqs on CentOS. |
As part of the prereqs I've previously identified for CentOS, I have installed:
This was after taking the package dependencies from the old Ubuntu |
Can you check all the config that happens in https://github.com/dotnet/cli/blob/rel/1.0.0/scripts/docker/centos/Dockerfile? That sets up our centos docker images for our CI. |
Obviously you shouldn't need 'make' or any C++ compiler, those are needed for building the CLI and Shared Framework code itself. |
Checked the prereqs, already have them all. Really don't think I need the clang, lldb, and llvm packages.
$ sudo yum install unzip libunwind gettext libcurl-devel openssl-devel zlib libicu-devel
Loaded plugins: fastestmirror
Loading mirror speeds from cached hostfile
* base: mirrors.syringanetworks.net
* elrepo: ftp.osuosl.org
* epel: mirrors.cat.pdx.edu
* extras: mirrors.xmission.com
* updates: mirrors.cat.pdx.edu
Package unzip-6.0-15.el7.x86_64 already installed and latest version
Package 2:libunwind-1.1-5.el7_2.2.x86_64 already installed and latest version
Package gettext-0.18.2.1-4.el7.x86_64 already installed and latest version
Package libcurl-devel-7.29.0-25.el7.centos.x86_64 already installed and latest version
Package 1:openssl-devel-1.0.1e-51.el7_2.5.x86_64 already installed and latest version
Package zlib-1.2.7-15.el7.x86_64 already installed and latest version
Package libicu-devel-50.1.2-15.el7.x86_64 already installed and latest version
Nothing to do |
@piotrpMSFT said they'd recently identified and fixed some compression library problems in CoreFX. |
The version you are using is the build of .NET Core 1.0.0 RTM and the SDK Preview2 that is going to be released next week. These should be the "golden bits". Just to ensure everything is right: where did you get |
@eerhardt yes I downloaded it today, let me re-download it and test. |
Deleted (We have these steps automated for our CI, they should definitely be right, but I re-ran them by hand to be sure.) |
Here's the last bit of the
|
I found it: This segmentation fault occurs when running with the mainline kernel, 4.6.2, but not with 3.10.0 (CentOS 7's default). Why decompression fails on an up-to-date kernel is beyond me, but I'll change the name of this issue. |
Hey wouldn't you know it: I can reproduce this on Ubuntu 14.04 with the 4.6.2 image from kernel.ubuntu.com. At least it's consistent! This also tells me that it doesn't repro on 4.2.0 either. |
@andschwa, do you have similar symptoms with updated kernel as http://unix.stackexchange.com/q/253903? Then these might be related: Answer: http://unix.stackexchange.com/a/255603 |
I'm not seeing the same errors as the Stack Exchange question. I have this in dmesg when it crashes:
I've ruled out the hypervisor and any extensions, as the CentOS VM is on VirtualBox (with guest additions) and the Ubuntu VM is on Hyper-V (with kernel LIS drivers). Running through valgrind didn't give me much. Attempting to get a coredump. |
Coredumps obtained for CentOS (and Debian), emailing them to you @ellismg. Fortunately, as I know how to reproduce the problem, I can downgrade my kernel and move on with work. Nonetheless, we should figure out why 4.6.x kernels and |
I vote we keep this one. |
Let me take a look. |
I installed the 4.6.2 kernel on my Ubuntu VM and found that this has nothing to do with the CLI. The runtime itself doesn't work on that Linux kernel. Even an attempt to build the coreclr repo fails with SIGSEGV in the GC when building mscorlib.dll / System.Private.CoreLib.dll. |
I can confirm that Debian stretch (4.6.0-1-amd64) is also affected with 1.0.0-preview2-003121. |
@klinkby, PR that fixes the issue: dotnet/coreclr#6027. |
Any chance the fix can be pushed out in a hotfix patch? This is currently preventing us from upgrading to 1.0 |
Having the same issue on Fedora 24 after fixing the libicu version issues.
|
@cruz82 What version of the kernel are you using? |
I'm on 4.6.3-300 |
There was a bug in 1.0.0, that manifests itself as a crash like the above on recent kernels (IIRC 4.6.0+). We have a fix in the runtime but no official release of it yet. If you want to rebuild the CoreCLR part from source I can show you the commit to cherry-pick. |
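For anyone who wants to check a batch of hosts against the kernel range reported in this thread, here is a minimal Python sketch. It assumes the threshold is 4.6.0, which is only the "IIRC" figure quoted above and is relevant only to the 1.0.0 runtime; the function names are made up for illustration.

```python
import re

# Assumption taken from this thread: crashes were reported on 4.6.0 and later,
# while 3.10.0 and 4.2.0 were reported as working.
AFFECTED_FROM = (4, 6, 0)


def parse_kernel_release(release):
    """Turn a release string such as '4.6.3-300.fc24.x86_64' into (4, 6, 3)."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if match is None:
        raise ValueError("unrecognized kernel release: %r" % release)
    major, minor, patch = match.groups()
    return int(major), int(minor), int(patch or 0)


def kernel_is_affected(release):
    """Return True if the release falls in the range reported as crashing."""
    return parse_kernel_release(release) >= AFFECTED_FROM
```

On a live machine the release string can be obtained with `platform.release()` from the standard library (the same value `uname -r` prints) and passed to `kernel_is_affected`.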
We're struggling with this also - we're seeing some instabilities in some apps running in Docker here after the underlying hosts (CoreOS) received some updates to the kernel recently. We're unable to run new builds right now (also happening in the containers) and already built apps are crashing fairly often. We can rebuild the CLR from source based on the latest preview2 if we can get some more information about which commits fix this. |
I'm still running into this problem with
I'm not sure how to create or where to find crash dumps. I have installed a few backport/sid packages as I need them for other things running on the same machine. That said, running Here are some currently installed packages that might be relevant:
|
I am also running into this problem on Debian Stretch. So much for getting started with learning C# :( |
@sekjun9878 - please see this conversation for Debian Stretch: dotnet/core#649 (comment) |
@eerhardt Thanks for the reference, but it's not the same issue. I can run |
The general thread says that .NET Core 2.0 doesn't support Debian Stretch unless you install OpenSSL 1.0.x. Do you have that installed on the machine? Debian Stretch comes with OpenSSL 1.1, which .NET Core doesn't support yet. So you'll get a segfault anywhere that tries using HTTPS or SSL. |
@eerhardt as you can see from my post, there were various versions of libssl installed on my machine, two of which are 1.0.x versions in 64 and 32 bit variants, but it still segfaulted. Here are the relevant files those packages result in:
Maybe |
Can you do an |
Sure: https://pastebin.com/s0NqFQDD As far as I can tell it does load |
@bartonjs - any thoughts? |
My psychic debugger is getting a weak ping off of libcurl+openssl11 (since I see libssl.so.1.1 installed). @cobrafast: What curl/libcurl do you have installed?
$ curl --version
curl 7.47.0 (x86_64-pc-linux-gnu) libcurl/7.47.0 GnuTLS/3.4.10 zlib/1.2.8 libidn/1.32 librtmp/2.3
Protocols: dict file ftp ftps gopher http https imap imaps ldap ldaps pop3 pop3s rtmp rtsp smb smbs smtp smtps telnet tftp
Features: AsynchDNS IDN IPv6 Largefile GSS-API Kerberos SPNEGO NTLM NTLM_WB SSL libz TLS-SRP UnixSockets
In the case of the machine I happen to be logged in to right now, the TLS provider is |
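The TLS provider is one of the `name/version` tokens on the first line of the `curl --version` output. A rough Python sketch for pulling it out follows; the backend list is an assumption and is not exhaustive, and `tls_backend` is a hypothetical helper name.

```python
# Assumption: a non-exhaustive set of TLS library names that can appear
# as "name/version" tokens in the first line of `curl --version`.
KNOWN_TLS_BACKENDS = ("OpenSSL", "LibreSSL", "BoringSSL", "GnuTLS", "NSS",
                      "mbedTLS", "wolfSSL")


def tls_backend(curl_version_output):
    """Return (name, version) of the TLS library curl reports, or None."""
    lines = curl_version_output.splitlines()
    if not lines:
        return None
    # The first line looks like: curl 7.47.0 (...) libcurl/7.47.0 GnuTLS/3.4.10 ...
    for token in lines[0].split():
        name, _, version = token.partition("/")
        if name in KNOWN_TLS_BACKENDS:
            return name, version
    return None
```

Fed the output shown above, this returns `("GnuTLS", "3.4.10")`. A result naming OpenSSL is the case where the libcurl and .NET native library versions need to be compared.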
@bartonjs My curl is newer, but it says
|
Okay, libcurl is the problem. Or, at least the packages are. But it's a 1.0 vs 1.0 problem. At line 224: Followed (at 232) by:
Okay, we've loaded libcrypto.so.1.0.0. Then, down at 544: 552:
And.... 580:
So we're calling X509_verify in libcrypto.so.1.0.0, but giving it a pointer created by libcrypto.so.1.0.2. And something between the two different compilations has clearly changed the offset of some value (or interpretation thereof). I don't know why Debian has packaged them side by side like that. Your system devel says that "1.0.0" is the SONAME to bind (/usr/lib/x86_64-linux-gnu/libssl.so -> libssl.so.1.0.0). If there isn't a libcurl which is bound to
|
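To illustrate why handing an object built by one library build to a function from another build can go wrong, here is a small Python `ctypes` sketch. The two structures are entirely hypothetical and are not the real OpenSSL layouts; they only show how an added field shifts the offset of everything after it.

```python
import ctypes


class CertV1(ctypes.Structure):
    """Hypothetical layout as seen by the older build."""
    _fields_ = [("refcount", ctypes.c_int32),
                ("key_bits", ctypes.c_int32)]


class CertV2(ctypes.Structure):
    """Hypothetical layout of the newer build: one field was inserted."""
    _fields_ = [("refcount", ctypes.c_int32),
                ("flags", ctypes.c_int32),
                ("key_bits", ctypes.c_int32)]


def read_key_bits_as_v1(obj_v2):
    """Reinterpret memory written with the V2 layout using the V1 layout."""
    raw = bytes(obj_v2)[:ctypes.sizeof(CertV1)]
    return CertV1.from_buffer_copy(raw).key_bits
```

With `CertV2(1, 16, 2048)`, code compiled against the V1 layout reads `key_bits` at offset 4 and gets 16 (the `flags` value) instead of 2048. In native code the misread value may be a pointer or a length, which is how a mismatch like this can end in a segfault rather than just a wrong number.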
Just to confirm, Debian Stretch does not ship with libssl1.0.0. It was an artifact package required by some userspace apps from the previous release like skype (libssl1.0.0:i386) and rstudio and ruby2.1 (libssl1.0.0:amd64) in my case. I've removed both of those packages and now I'm getting this error:
Is dotnet core v1 supposed to support libssl1.0.2 or do I need libssl1.0.0? |
@sekjun9878 If you build locally, any 1.0.x build is fine. The prebuilts, which were built on/for Jessie, require libssl.1.0.0. .NET Core 2.0 has added more flexibility, and will work prebuilt on both Jessie and Stretch. |
It looks like this will be supported in the future: dotnet/corefx@4d7ee84 But the 1.0.x branch links against |
I just tried the .NET Core 2.0 Preview 1 package as per https://www.microsoft.com/net/core/preview#linuxdebian and the problem seems to persist.
|
@cobrafast just to be sure, this is on Jessie, right? |
@ellismg it's the same system I've been talking about previously (i.e. https://github.com/dotnet/cli/issues/3681#issuecomment-298802996) |
@cobrafast Support for Jessie was fixed with dotnet/corefx@4d7ee84, but that happened after the preview1 build. You'll need to try nightlies, or wait for the next release. |
Issue seems to persist with the build from the dotnet debian package repository.
|
Installing packages from the jessie repo doesn't help either. Perhaps we can get this issue reopened? |
@cobrafast can you please run |
@janvorli sure:
|
@cobrafast the curl was built against OpenSSL 1.0.2l and according to your list you seem to have OpenSSL 1.0.1t installed. Do you happen to have the 1.0.2l package installed too? I am not sure about Debian 8, but in Debian 9, libssl 1.0.2 has a different soname than libssl 1.0.1, so the two can be installed side by side. If they end up in the same process (e.g. if one was pulled in by libcurl and the other by the .NET native libraries), threading issues can occur, leading to crashes like the ones you are seeing. |
@janvorli hmm, there are several
Of those I was able to remove The |
Seems like downgrading |
The proc map shows I was right - both the libssl.so.1.0.0 and libssl.so.1.0.2 were loaded at the same time. It seems that I'll need to add an env var that would allow people to override the libssl version that we use to allow fixing these corner cases. |
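For others who want to run the same check on their own process, here is a minimal Python sketch that scans the text of `/proc/<pid>/maps` for shared libraries mapped in more than one version. The function names are hypothetical, and the sketch only looks at file names of the form `name.so.version`.

```python
import re
from collections import defaultdict

# Matches file names such as "libssl.so.1.0.0" -> base "libssl.so", version "1.0.0".
_SO_RE = re.compile(r"^(?P<base>.+?\.so)\.(?P<version>[0-9][0-9A-Za-z.]*)$")


def loaded_library_versions(maps_text):
    """Map each versioned shared library base name to the set of versions seen."""
    versions = defaultdict(set)
    for line in maps_text.splitlines():
        # Fields: address range, perms, offset, device, inode, optional path.
        fields = line.split(None, 5)
        if len(fields) < 6:
            continue  # anonymous mapping with no backing file
        file_name = fields[5].strip().rsplit("/", 1)[-1]
        match = _SO_RE.match(file_name)
        if match:
            versions[match.group("base")].add(match.group("version"))
    return versions


def conflicting_libraries(maps_text):
    """Return only the libraries that are mapped in more than one version."""
    return {base: sorted(found)
            for base, found in loaded_library_versions(maps_text).items()
            if len(found) > 1}
```

Calling `conflicting_libraries(open("/proc/%d/maps" % pid).read())` on the running `dotnet` process would be expected to report `libssl.so` with both `1.0.0` and `1.0.2` in the situation described above.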
In the meantime, I've tried to symlink /usr/lib/x86_64-linux-gnu/libssl.so.1.0.0 to /usr/lib/x86_64-linux-gnu/libssl.so.1.0.2 . Do it only at your own risk, this might introduce more issues than it fixes. So far, it looks good, though. |
@v6ak on Debian Jessie, the 1.0.1 is the supported version. See https://packages.debian.org/search?suite=jessie&searchon=names&keywords=openssl. The CURL that @cobrafast was using is not the version that's the standard version for Debian Jessie. His version was 7.55 while Debian Jessie has version 7.38. See The Debian Stretch is the one that has moved to 1.0.2. |
Steps to reproduce
CentOS 7 with kernel-ml installed, the mainline Linux kernel at version 4.6.2.
./dotnet-install.sh -c preview -v 1.0.0-preview2-003121
~/.dotnet/dotnet restore
Expected behavior
Packages to be restored.
Actual behavior
Environment data
dotnet --info output:
This is a fresh CentOS 7 machine I made to replace the Debian Testing machine I couldn't build on in #6483.